{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "if not any(path.endswith('textbook') for path in sys.path):\n",
    "    sys.path.append(os.path.abspath('../../..'))\n",
    "from textbook_utils import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(sec:eda_feature_types)=\n",
    "# Feature Types\n",
    "\n",
    "Before making an exploratory plot, or any plot for that matter, it's a good idea to examine the feature (or features) and decide on its *feature type*. (Sometimes we refer to a feature as a _variable_ and its type as *variable type*.) Although there are multiple ways of categorizing feature types, in this book we consider three basic ones:"
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "*Nominal*\n",
     ": A feature that represents \"named\" categories, where the categories do not have a natural ordering, is called nominal. Examples include political party affiliation (Democrat, Republican, Green, Other); dog type (herding, hound, non-sporting, sporting, terrier, toy, working); and computer operating system (Windows, macOS, Linux).\n",
     "\n",
     "*Ordinal*\n",
     ": A feature that represents ordered categories is called ordinal. Examples of ordinal features are t-shirt size (small, medium, large); Likert-scale response (disagree, neutral, agree); and level of education (high school, college, graduate school). It is important to\n",
     "note that with an ordinal feature, the difference between, say, small and\n",
     "medium need not be the same as the difference between medium and large. Also, the differences between consecutive categories may not even be quantifiable. Think of the number of stars in a restaurant review\n",
     "and what one star means in comparison to two stars.\n",
     "\n",
     "Ordinal and nominal data are subtypes of *categorical* data. Another name\n",
     "for categorical data is *qualitative*. In contrast, we also have\n",
     "*quantitative* features:\n",
     "\n",
     "\n",
     "*Quantitative*\n",
     ": Data that represent numeric measurements or quantities\n",
     "are called quantitative. Examples include height measured to the\n",
     "nearest cm, price reported in USD, and distance measured to the nearest\n",
     "km. Quantitative features can be further divided into\n",
     "*discrete*, meaning that the feature can take on only separate values such as counts (and often only a few of them), and\n",
    "*continuous*, meaning that the quantity could in principle be measured to\n",
    "arbitrary precision. The number of siblings in a family takes on a discrete\n",
    "set of values (such as 0, 1, 2,..., 8). In contrast, height can theoretically be\n",
    "reported to any number of decimal places, so we consider it continuous.\n",
    "There is no hard and fast rule to determine whether a quantity is discrete or continuous. In some cases, it can be a judgment call, and in others, we may want to purposefully consider a continuous feature to be discrete. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A feature type is not the same thing as a data storage type. Each column in a  `pandas` dataframe has its own *storage type*. These types can be integer, floating point, boolean, date-time format, category, and object (strings of varying length are stored as objects in Python with pointers to the strings).\n",
    "We use the term *feature type* to refer to a\n",
    "conceptual notion of the information and the term *storage type* to refer to the\n",
    "representation of the information in the computer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A feature stored as an integer can represent nominal data, strings can be\n",
    "quantitative (like `\"\\$100.00\"`),  and, in practice, boolean values often\n",
    "represent nominal features that have only two possible values.\n"
   ]
  },
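   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "As a minimal sketch of this distinction, consider a small made-up dataframe (the column names and values are hypothetical and not part of the AKC data). The `price` column is stored as strings but is conceptually quantitative, so one possible approach is to convert it before computing with it:\n",
     "\n",
     "```python\n",
     "import pandas as pd\n",
     "\n",
     "# Hypothetical data: prices stored as strings, a nominal feature stored as integers\n",
     "df = pd.DataFrame({\n",
     "    'price': ['$100.00', '$25.50', '$7.25'],\n",
     "    'region_code': [1, 2, 1],\n",
     "})\n",
     "\n",
     "# Strip the dollar sign (a literal match, not a regex) and convert to floats\n",
     "price_num = df['price'].str.replace('$', '', regex=False).astype(float)\n",
     "\n",
     "# The storage types alone suggest the opposite of the feature types\n",
     "print(df.dtypes)\n",
     "print(price_num.sum())\n",
     "```\n",
     "\n",
     "Here `region_code` has an integer storage type, yet adding or averaging its values would be meaningless."
    ]
   },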
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{note}\n",
    "\n",
    "`pandas` calls the storage type `dtype`, which is short for data type.\n",
    "We refrain from using the term *data type* here because it can be confused with\n",
    "both storage type and feature type.\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to determine a feature type, we often need to consult a\n",
    "dataset’s *data dictionary* or *codebook*. A data dictionary is a document\n",
    "included with the data that describes what each column in the data table\n",
    "represents.  In the following example, we take a look at the storage and\n",
    "feature types of the columns in a dataframe about various dog breeds, \n",
    "and we find that the storage type is often not a good indicator of the kind \n",
    "of information contained in a field."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: Dog Breeds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use the [American Kennel Club (AKC)](https://www.akc.org) data on registered dog breeds to introduce the various concepts related to EDA. The AKC, a nonprofit that was founded in 1884, has the stated mission to \"advance the study, breeding, exhibiting, running and maintenance of purebred dogs.\"  The AKC organizes events like the National Championship, Agility Invitational, and Obedience Classic, and mixed-breed dogs are welcome to participate in most events. The [Information Is Beautiful](https://informationisbeautiful.net) website provides a dataset with information from the AKC on 172 breeds. Its visualization, [Best in Show](https://www.informationisbeautiful.net/visualizations/best-in-show-whats-the-top-data-dog/), incorporates many features of the breeds and is fun to look at."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The AKC dataset contains several different kinds of features, and we have extracted a handful of them that show a variety of types of information. These features include the name of the breed; its longevity, weight, and height; and other information such as its suitability for children and the number of repetitions needed to learn a new trick. Each record in the dataset is a breed of dog, and the information provided is meant to be typical of that breed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's read the data into a dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>breed</th>\n",
       "      <th>group</th>\n",
       "      <th>score</th>\n",
       "      <th>longevity</th>\n",
       "      <th>...</th>\n",
       "      <th>size</th>\n",
       "      <th>weight</th>\n",
       "      <th>height</th>\n",
       "      <th>repetition</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Border Collie</td>\n",
       "      <td>herding</td>\n",
       "      <td>3.64</td>\n",
       "      <td>12.52</td>\n",
       "      <td>...</td>\n",
       "      <td>medium</td>\n",
       "      <td>NaN</td>\n",
       "      <td>51.0</td>\n",
       "      <td>&lt;5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Border Terrier</td>\n",
       "      <td>terrier</td>\n",
       "      <td>3.61</td>\n",
       "      <td>14.00</td>\n",
       "      <td>...</td>\n",
       "      <td>small</td>\n",
       "      <td>6.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15-25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Brittany</td>\n",
       "      <td>sporting</td>\n",
       "      <td>3.54</td>\n",
       "      <td>12.92</td>\n",
       "      <td>...</td>\n",
       "      <td>medium</td>\n",
       "      <td>16.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>5-15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>169</th>\n",
       "      <td>Wire Fox Terrier</td>\n",
       "      <td>terrier</td>\n",
       "      <td>NaN</td>\n",
       "      <td>13.17</td>\n",
       "      <td>...</td>\n",
       "      <td>small</td>\n",
       "      <td>8.0</td>\n",
       "      <td>38.0</td>\n",
       "      <td>25-40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>170</th>\n",
       "      <td>Wirehaired Pointing Griffon</td>\n",
       "      <td>sporting</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8.80</td>\n",
       "      <td>...</td>\n",
       "      <td>medium</td>\n",
       "      <td>NaN</td>\n",
       "      <td>56.0</td>\n",
       "      <td>25-40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>171</th>\n",
       "      <td>Xoloitzcuintli</td>\n",
       "      <td>non-sporting</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>medium</td>\n",
       "      <td>NaN</td>\n",
       "      <td>42.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>172 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                           breed         group  score  longevity  ...    size  \\\n",
       "0                  Border Collie       herding   3.64      12.52  ...  medium   \n",
       "1                 Border Terrier       terrier   3.61      14.00  ...   small   \n",
       "2                       Brittany      sporting   3.54      12.92  ...  medium   \n",
       "..                           ...           ...    ...        ...  ...     ...   \n",
       "169             Wire Fox Terrier       terrier    NaN      13.17  ...   small   \n",
       "170  Wirehaired Pointing Griffon      sporting    NaN       8.80  ...  medium   \n",
       "171               Xoloitzcuintli  non-sporting    NaN        NaN  ...  medium   \n",
       "\n",
       "     weight  height  repetition  \n",
       "0       NaN    51.0          <5  \n",
       "1       6.0     NaN       15-25  \n",
       "2      16.0    48.0        5-15  \n",
       "..      ...     ...         ...  \n",
       "169     8.0    38.0       25-40  \n",
       "170     NaN    56.0       25-40  \n",
       "171     NaN    42.0         NaN  \n",
       "\n",
       "[172 rows x 12 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs = pd.read_csv('data/akc.csv')\n",
    "dogs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A cursory glance at the table shows us that breed, group, and size appear to be\n",
    "strings, and the other columns numbers. The summary of the dataframe, shown\n",
    "here, provides the index, name, count of non-null values, and `dtype` for each column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 172 entries, 0 to 171\n",
      "Data columns (total 12 columns):\n",
      " #   Column          Non-Null Count  Dtype  \n",
      "---  ------          --------------  -----  \n",
      " 0   breed           172 non-null    object \n",
      " 1   group           172 non-null    object \n",
      " 2   score           87 non-null     float64\n",
      " 3   longevity       135 non-null    float64\n",
      " 4   ailments        148 non-null    float64\n",
      " 5   purchase_price  146 non-null    float64\n",
      " 6   grooming        112 non-null    float64\n",
      " 7   children        112 non-null    float64\n",
      " 8   size            172 non-null    object \n",
      " 9   weight          86 non-null     float64\n",
      " 10  height          159 non-null    float64\n",
      " 11  repetition      132 non-null    object \n",
      "dtypes: float64(8), object(4)\n",
      "memory usage: 16.2+ KB\n"
     ]
    }
   ],
   "source": [
    "dogs.info()"
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "Several columns of this dataframe have a numeric storage type, as\n",
     "signified by `float64`, which means that the column can contain numbers other than integers.\n",
    "We also confirm that `pandas` encodes the string columns as the `object` `dtype`, rather than a `string` `dtype`.\n",
    "Notice that we guessed incorrectly that `repetition` is quantitative.\n",
    "Looking a bit more carefully at the data table, we see that `repetition` contains string values for ranges,\n",
    "such as `\"<5\"`, `\"15-25\"`, and `\"25-40\"`,  so this feature is ordinal."
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     ":::{note}\n",
     "\n",
     "In computer architecture, a floating-point number, or \"float\" for short,\n",
     "refers to a number that can have a fractional part. We won't go in depth\n",
     "into computer architecture in this book, but we will point it out when it\n",
     "affects terminology, as in this case.\n",
     "The `dtype` `float64` says that the column contains decimal numbers that each\n",
     "take up 64 bits of space when stored in computer memory.\n",
     "\n",
     "Additionally, `pandas` uses optimized storage types for numeric data, like `float64` or `int64`.\n",
     "However, by default it doesn't use optimized storage for Python objects like strings,\n",
    "dictionaries, or sets, so these are all stored as the `object` `dtype`.\n",
    "This means that the storage type is ambiguous, but in most settings\n",
    "we know whether `object` columns contain strings or some other Python type.\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the column storage types, we might guess `ailments` and `children` are quantitative\n",
    "features because they are stored as `float64` `dtype`s. \n",
    "But let's tally their unique values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ailments\n",
       "0.0    61\n",
       "1.0    42\n",
       "2.0    24\n",
       "4.0    10\n",
       "3.0     6\n",
       "5.0     3\n",
       "9.0     1\n",
       "8.0     1\n",
       "Name: count, dtype: int64"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display_df(dogs['ailments'].value_counts(), rows=8)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "children\n",
       "1.0    67\n",
       "2.0    35\n",
       "3.0    10\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs['children'].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both `ailments` and `children` only take on a few integer values.\n",
    "What does a value\n",
    "of `3.0` for `children` or `9.0` for `ailments` mean? We need more information to\n",
    "figure this out. The name of the column and how the information is stored in\n",
    "the dataframe is not enough.\n",
    "Instead, we consult the data dictionary shown in {numref}`Table %s <akc-codebook>`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{table} AKC dog breed codebook\n",
    ":name: akc-codebook\n",
    "\n",
    "| Feature        | Description                                                                                   |\n",
    "|----------------|-----------------------------------------------------------------------------------------------|\n",
    "| `breed`          | Dog breed, e.g., Border Collie, Dalmatian, Vizsla                                             |\n",
    "| `group`          | American Kennel Club grouping (herding, hound, non-sporting, sporting, terrier, toy, working) |\n",
    "| `score`          | AKC score                                                                                     |\n",
    "| `longevity`      | Typical lifetime (years)                                                                      |\n",
    "| `ailments`       | Number of serious genetic ailments                                                            |\n",
    "| `purchase_price` | Average purchase price from puppyfind.com                                                     |\n",
    "| `grooming`       | Grooming required once every: 1 = day, 2 = week, 3 = few weeks                                |\n",
    "| `children`       | Suitability for children: 1 = high, 2 = medium, 3 = low                                       |\n",
    "| `size`           | Size: small, medium, large                                                                    |\n",
    "| `weight`         | Typical weight (kg)                                                                           |\n",
    "| `height`         | Typical height from the shoulder (cm)                                                         |\n",
    "| `repetition`     | Number of repetitions to understand a new command: <5, 5–15, 15–25, 25–40, 40–80, >80         |\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Although the data dictionary does not explicitly specify the feature types, the\n",
    "description is enough for us to figure out that the feature `children` represents the suitability of\n",
    "the breed for children, and a value of `1.0` corresponds to \"high\" suitability.\n",
    "We also find that the feature `ailments` is a count of the number of serious genetic\n",
    "ailments that dogs of this breed tend to have.  Based on the codebook, we treat\n",
    "`children` as a categorical feature, even though it is stored as a floating-point number, and since low < medium < high, the feature is ordinal.  Since\n",
    "`ailments` is a count, we treat it as a quantitative (numeric) type,\n",
    "and for some analyses we further define it as discrete because there\n",
    "are only a few possible values that `ailments` can take on."
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "The codebook also confirms that the features `score`, `longevity`,\n",
     "`purchase_price`, `weight`, and `height` are quantitative.\n",
     "The idea here is that numeric features have values that can be compared through differences.\n",
     "It makes sense to say that chihuahuas typically live about four years longer than dachshunds (16.5 versus 12.6 years). Another check is whether it makes sense to compare ratios of values:\n",
     "a dachshund is usually about five times as heavy as a chihuahua (11 kg versus 2 kg).\n",
    "All of these quantitative features are continuous; only `ailments` is discrete."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data dictionary descriptions for `breed`, `group`, `size`, and `repetition`\n",
    "suggest that these features are qualitative. Each variable has\n",
    "different, and yet commonly found, characteristics that are worth exploring a\n",
    "bit more. We do this by examining the counts of each unique value for the various\n",
    "features. We begin with `breed`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "breed\n",
       "Border Collie       1\n",
       "Great Pyrenees      1\n",
       "English Foxhound    1\n",
       "                   ..\n",
       "Saluki              1\n",
       "Giant Schnauzer     1\n",
       "Xoloitzcuintli      1\n",
       "Name: count, Length: 172, dtype: int64"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs['breed'].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `breed` feature has 172 unique values---that's the same as the number of\n",
    "records in the dataframe---so we can think of `breed` as the *primary key* for the data table. \n",
    "By design, each dog breed has one record, and this `breed` feature determines\n",
    "the dataset's granularity.  Although `breed` is also considered a nominal feature,\n",
    "it doesn't really make sense to analyze it. We do want to confirm that all\n",
    "values are unique and clean, but otherwise we would only use it to, say, label\n",
    "unusual values in a plot."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we examine the feature `group`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "group\n",
       "terrier         28\n",
       "sporting        28\n",
       "working         27\n",
       "hound           26\n",
       "herding         25\n",
       "toy             19\n",
       "non-sporting    19\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs['group'].value_counts()"
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "This feature has seven unique values.\n",
     "Since a dog breed labeled as \"sporting\" and another considered to be \"toy\"\n",
     "differ from each other in several ways, the categories cannot be easily reduced to an ordering.\n",
     "So we consider `group` a nominal feature.\n",
     "With a nominal feature, even the direction of a difference between two categories has no meaning."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we examine the unique values and their counts for `size`: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "size\n",
       "medium    60\n",
       "small     58\n",
       "large     54\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs['size'].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `size` feature has a natural ordering: small < medium < large, so it is\n",
    "ordinal.  We don't know how the category \"small\" is determined, but we do know\n",
    "that a small breed is in some sense smaller than a medium-sized breed, which is\n",
    "smaller than a large one.  We have an ordering, but differences and ratios\n",
    "don't make sense conceptually for this feature."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `repetition` feature is an example of a quantitative variable that has been\n",
    "collapsed into categories to become ordinal. The codebook tells us that\n",
    "`repetition` is the number of times a new command needs to be repeated before\n",
    "the dog understands it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "repetition\n",
       "25-40     39\n",
       "15-25     29\n",
       "40-80     22\n",
       "5-15      21\n",
       "80-100    11\n",
       "<5        10\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs['repetition'].value_counts()"
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "The numeric values have been lumped together as\n",
     "`<5`, `5-15`, `15-25`, `25-40`, `40-80`, and `80-100`, and these categories have different widths. The first covers fewer than 5 repetitions, while the others are 10, 15, 20, or 40 repetitions wide. The ordering is\n",
     "clear, but the gaps from one category to the next are not of the same magnitude."
   ]
  },
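   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "Notice that `value_counts` lists the categories by frequency, not by their natural order. One way to have `pandas` respect the ordering is an ordered categorical storage type. The following is only a sketch: the series `rep` holds a few made-up values standing in for the `repetition` column, and the names `rep`, `levels`, and `rep_ord` are our own.\n",
     "\n",
     "```python\n",
     "import pandas as pd\n",
     "\n",
     "# Hypothetical values standing in for the `repetition` column\n",
     "rep = pd.Series(['25-40', '<5', '80-100', '5-15', '25-40', '15-25', '40-80'])\n",
     "\n",
     "# Declare the categories in their natural order\n",
     "levels = ['<5', '5-15', '15-25', '25-40', '40-80', '80-100']\n",
     "rep_ord = rep.astype(pd.CategoricalDtype(categories=levels, ordered=True))\n",
     "\n",
     "# sort_index arranges the counts by category order rather than by frequency\n",
     "counts = rep_ord.value_counts().sort_index()\n",
     "print(counts)\n",
     "```\n",
     "\n",
     "The ordered storage type records the ordering of the categories, but it still says nothing about the size of the gaps between them."
    ]
   },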
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have double-checked the values in the variables against the\n",
    "descriptions in the codebook, we can augment the data dictionary to include\n",
    "this additional information about the feature types.\n",
    "Our revised dictionary appears in {numref}`Table %s <revised-akc-codebook>`."
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     ":::{table} Revised AKC dog breed codebook\n",
     ":name: revised-akc-codebook\n",
     "\n",
     "| Feature          | Description                                                                              | Feature type            | Storage type   |\n",
     "|:-----------------|:-----------------------------------------------------------------------------------------|:------------------------|:---------------|\n",
     "| `breed`          | Dog breed, e.g., Border Collie, Dalmatian, Vizsla                                        | primary key             | string         |\n",
     "| `group`          | AKC group (herding, hound, non-sporting, sporting, terrier, toy, working)                | qualitative - nominal   | string         |\n",
     "| `score`          | AKC score                                                                                | quantitative            | floating point |\n",
     "| `longevity`      | Typical lifetime (years)                                                                 | quantitative            | floating point |\n",
     "| `ailments`       | Number of serious genetic ailments (0, 1, ..., 9)                                        | quantitative - discrete | floating point |\n",
     "| `purchase_price` | Average purchase price from puppyfind.com                                                | quantitative            | floating point |\n",
     "| `grooming`       | Groom once every: 1 = day, 2 = week, 3 = few weeks                                       | qualitative - ordinal   | floating point |\n",
     "| `children`       | Suitability for children: 1 = high, 2 = medium, 3 = low                                  | qualitative - ordinal   | floating point |\n",
     "| `size`           | Size: small, medium, large                                                               | qualitative - ordinal   | string         |\n",
     "| `weight`         | Typical weight (kg)                                                                      | quantitative            | floating point |\n",
     "| `height`         | Typical height from the shoulder (cm)                                                    | quantitative            | floating point |\n",
     "| `repetition`     | Number of repetitions to understand a new command: <5, 5–15, 15–25, 25–40, 40–80, 80–100 | qualitative - ordinal   | string         |\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This sharper understanding of the feature types of the AKC data helps us make quality checks and transformations.\n",
    "We discussed transformations in {numref}`Chapter %s <ch:wrangling>`, but there are a\n",
    "few additional transformations that were not covered. These pertain to categories of qualitative features, and we \n",
    "describe them next."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transforming Qualitative Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whether a feature is nominal or ordinal, we may find it useful to relabel categories so that they are more informative; collapse categories to simplify a visualization; and even convert a numeric feature to ordinal to focus on particular transition points. We explain when we may want to make each of these transformations and give examples."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Relabel categories\n",
    "\n",
    "Summary statistics, like the mean and the median, make\n",
    "sense for quantitative data, but typically not for qualitative data.  For\n",
    "example, the average price for toy breeds makes sense to calculate (\\$687), but\n",
    "the \"average\" breed suitability for children doesn't.\n",
    "However, `pandas` will happily compute the mean of the values in the `children`\n",
    "column if we ask it to:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.4910714285714286"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Don't use this value in actual data analysis!\n",
    "dogs[\"children\"].mean()"
   ]
  },
   {
    "cell_type": "markdown",
    "metadata": {
     "tags": []
    },
    "source": [
     "Instead, we want to consider the distribution of ones, twos, and threes in\n",
     "the `children` column."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{note}\n",
    "\n",
    "The key difference between storage types and feature types is that storage\n",
    "types say what operations we can write code to *compute*, while\n",
    "feature types say what operations *make sense for the data*.\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can transform `children` by replacing the numbers with their string\n",
    "descriptions.  Changing 1, 2, and 3 into high, medium, and low makes\n",
    "it easier to  recognize that  `children` is categorical. With strings, we would\n",
    "not be tempted to compute a mean, the categories would be connected to their\n",
    "meaning, and labels for plots would have reasonable values by default.\n",
    "For example, let's focus on just the toy breeds and make a bar plot of suitability for children. First, we create a new column with the categories of suitability as strings:"
   ]
  },
   {
    "cell_type": "code",
    "execution_count": 3,
    "metadata": {},
    "outputs": [],
    "source": [
     "# Map the numeric codes from the codebook to their string labels\n",
     "kids = {1: \"high\", 2: \"medium\", 3: \"low\"}\n",
     "dogs = dogs.assign(kids=dogs['children'].replace(kids))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>breed</th>\n",
       "      <th>group</th>\n",
       "      <th>score</th>\n",
       "      <th>longevity</th>\n",
       "      <th>...</th>\n",
       "      <th>weight</th>\n",
       "      <th>height</th>\n",
       "      <th>repetition</th>\n",
       "      <th>kids</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Border Collie</td>\n",
       "      <td>herding</td>\n",
       "      <td>3.64</td>\n",
       "      <td>12.52</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>51.0</td>\n",
       "      <td>&lt;5</td>\n",
       "      <td>low</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Border Terrier</td>\n",
       "      <td>terrier</td>\n",
       "      <td>3.61</td>\n",
       "      <td>14.00</td>\n",
       "      <td>...</td>\n",
       "      <td>6.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15-25</td>\n",
       "      <td>high</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Brittany</td>\n",
       "      <td>sporting</td>\n",
       "      <td>3.54</td>\n",
       "      <td>12.92</td>\n",
       "      <td>...</td>\n",
       "      <td>16.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>5-15</td>\n",
       "      <td>medium</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>169</th>\n",
       "      <td>Wire Fox Terrier</td>\n",
       "      <td>terrier</td>\n",
       "      <td>NaN</td>\n",
       "      <td>13.17</td>\n",
       "      <td>...</td>\n",
       "      <td>8.0</td>\n",
       "      <td>38.0</td>\n",
       "      <td>25-40</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>170</th>\n",
       "      <td>Wirehaired Pointing Griffon</td>\n",
       "      <td>sporting</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8.80</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>56.0</td>\n",
       "      <td>25-40</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>171</th>\n",
       "      <td>Xoloitzcuintli</td>\n",
       "      <td>non-sporting</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>42.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>172 rows × 13 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                           breed         group  score  longevity  ...  weight  \\\n",
       "0                  Border Collie       herding   3.64      12.52  ...     NaN   \n",
       "1                 Border Terrier       terrier   3.61      14.00  ...     6.0   \n",
       "2                       Brittany      sporting   3.54      12.92  ...    16.0   \n",
       "..                           ...           ...    ...        ...  ...     ...   \n",
       "169             Wire Fox Terrier       terrier    NaN      13.17  ...     8.0   \n",
       "170  Wirehaired Pointing Griffon      sporting    NaN       8.80  ...     NaN   \n",
       "171               Xoloitzcuintli  non-sporting    NaN        NaN  ...     NaN   \n",
       "\n",
       "     height  repetition    kids  \n",
       "0      51.0          <5     low  \n",
       "1       NaN       15-25    high  \n",
       "2      48.0        5-15  medium  \n",
       "..      ...         ...     ...  \n",
       "169    38.0       25-40     NaN  \n",
       "170    56.0       25-40     NaN  \n",
       "171    42.0         NaN     NaN  \n",
       "\n",
       "[172 rows x 13 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dogs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we can make a bar plot of the counts of each suitability category among the toy breeds:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.plotly.v1+json": {
       "config": {
        "plotlyServerURL": "https://plot.ly"
       },
       "data": [
        {
         "alignmentgroup": "True",
         "hovertemplate": "Suitability for children=%{x}<br>count=%{y}<extra></extra>",
         "legendgroup": "",
         "marker": {
          "color": "#1F77B4",
          "pattern": {
           "shape": ""
          }
         },
         "name": "",
         "offsetgroup": "",
         "orientation": "v",
         "showlegend": false,
         "textposition": "auto",
         "type": "bar",
         "x": [
          "high",
          "low",
          "medium"
         ],
         "xaxis": "x",
         "y": [
          3,
          5,
          5
         ],
         "yaxis": "y"
        }
       ],
       "layout": {
        "barmode": "relative",
        "height": 250,
        "legend": {
         "tracegroupgap": 0
        },
        "template": {
         "data": {
          "bar": [
           {
            "error_x": {
             "color": "rgb(36,36,36)"
            },
            "error_y": {
             "color": "rgb(36,36,36)"
            },
            "marker": {
             "line": {
              "color": "white",
              "width": 0.5
             },
             "pattern": {
              "fillmode": "overlay",
              "size": 10,
              "solidity": 0.2
             }
            },
            "type": "bar"
           }
          ],
          "barpolar": [
           {
            "marker": {
             "line": {
              "color": "white",
              "width": 0.5
             },
             "pattern": {
              "fillmode": "overlay",
              "size": 10,
              "solidity": 0.2
             }
            },
            "type": "barpolar"
           }
          ],
          "carpet": [
           {
            "aaxis": {
             "endlinecolor": "rgb(36,36,36)",
             "gridcolor": "white",
             "linecolor": "white",
             "minorgridcolor": "white",
             "startlinecolor": "rgb(36,36,36)"
            },
            "baxis": {
             "endlinecolor": "rgb(36,36,36)",
             "gridcolor": "white",
             "linecolor": "white",
             "minorgridcolor": "white",
             "startlinecolor": "rgb(36,36,36)"
            },
            "type": "carpet"
           }
          ],
          "choropleth": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "type": "choropleth"
           }
          ],
          "contour": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "contour"
           }
          ],
          "contourcarpet": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "type": "contourcarpet"
           }
          ],
          "heatmap": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "heatmap"
           }
          ],
          "heatmapgl": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "heatmapgl"
           }
          ],
          "histogram": [
           {
            "marker": {
             "line": {
              "color": "white",
              "width": 0.6
             }
            },
            "type": "histogram"
           }
          ],
          "histogram2d": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "histogram2d"
           }
          ],
          "histogram2dcontour": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "histogram2dcontour"
           }
          ],
          "mesh3d": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "type": "mesh3d"
           }
          ],
          "parcoords": [
           {
            "line": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "parcoords"
           }
          ],
          "pie": [
           {
            "automargin": true,
            "type": "pie"
           }
          ],
          "scatter": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scatter"
           }
          ],
          "scatter3d": [
           {
            "line": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scatter3d"
           }
          ],
          "scattercarpet": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scattercarpet"
           }
          ],
          "scattergeo": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scattergeo"
           }
          ],
          "scattergl": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scattergl"
           }
          ],
          "scattermapbox": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scattermapbox"
           }
          ],
          "scatterpolar": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scatterpolar"
           }
          ],
          "scatterpolargl": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scatterpolargl"
           }
          ],
          "scatterternary": [
           {
            "marker": {
             "colorbar": {
              "outlinewidth": 1,
              "tickcolor": "rgb(36,36,36)",
              "ticks": "outside"
             }
            },
            "type": "scatterternary"
           }
          ],
          "surface": [
           {
            "colorbar": {
             "outlinewidth": 1,
             "tickcolor": "rgb(36,36,36)",
             "ticks": "outside"
            },
            "colorscale": [
             [
              0,
              "#440154"
             ],
             [
              0.1111111111111111,
              "#482878"
             ],
             [
              0.2222222222222222,
              "#3e4989"
             ],
             [
              0.3333333333333333,
              "#31688e"
             ],
             [
              0.4444444444444444,
              "#26828e"
             ],
             [
              0.5555555555555556,
              "#1f9e89"
             ],
             [
              0.6666666666666666,
              "#35b779"
             ],
             [
              0.7777777777777778,
              "#6ece58"
             ],
             [
              0.8888888888888888,
              "#b5de2b"
             ],
             [
              1,
              "#fde725"
             ]
            ],
            "type": "surface"
           }
          ],
          "table": [
           {
            "cells": {
             "fill": {
              "color": "rgb(237,237,237)"
             },
             "line": {
              "color": "white"
             }
            },
            "header": {
             "fill": {
              "color": "rgb(217,217,217)"
             },
             "line": {
              "color": "white"
             }
            },
            "type": "table"
           }
          ]
         },
         "layout": {
          "annotationdefaults": {
           "arrowhead": 0,
           "arrowwidth": 1
          },
          "autosize": true,
          "autotypenumbers": "strict",
          "coloraxis": {
           "colorbar": {
            "outlinewidth": 1,
            "tickcolor": "rgb(36,36,36)",
            "ticks": "outside"
           }
          },
          "colorscale": {
           "diverging": [
            [
             0,
             "rgb(103,0,31)"
            ],
            [
             0.1,
             "rgb(178,24,43)"
            ],
            [
             0.2,
             "rgb(214,96,77)"
            ],
            [
             0.3,
             "rgb(244,165,130)"
            ],
            [
             0.4,
             "rgb(253,219,199)"
            ],
            [
             0.5,
             "rgb(247,247,247)"
            ],
            [
             0.6,
             "rgb(209,229,240)"
            ],
            [
             0.7,
             "rgb(146,197,222)"
            ],
            [
             0.8,
             "rgb(67,147,195)"
            ],
            [
             0.9,
             "rgb(33,102,172)"
            ],
            [
             1,
             "rgb(5,48,97)"
            ]
           ],
           "sequential": [
            [
             0,
             "#440154"
            ],
            [
             0.1111111111111111,
             "#482878"
            ],
            [
             0.2222222222222222,
             "#3e4989"
            ],
            [
             0.3333333333333333,
             "#31688e"
            ],
            [
             0.4444444444444444,
             "#26828e"
            ],
            [
             0.5555555555555556,
             "#1f9e89"
            ],
            [
             0.6666666666666666,
             "#35b779"
            ],
            [
             0.7777777777777778,
             "#6ece58"
            ],
            [
             0.8888888888888888,
             "#b5de2b"
            ],
            [
             1,
             "#fde725"
            ]
           ],
           "sequentialminus": [
            [
             0,
             "#440154"
            ],
            [
             0.1111111111111111,
             "#482878"
            ],
            [
             0.2222222222222222,
             "#3e4989"
            ],
            [
             0.3333333333333333,
             "#31688e"
            ],
            [
             0.4444444444444444,
             "#26828e"
            ],
            [
             0.5555555555555556,
             "#1f9e89"
            ],
            [
             0.6666666666666666,
             "#35b779"
            ],
            [
             0.7777777777777778,
             "#6ece58"
            ],
            [
             0.8888888888888888,
             "#b5de2b"
            ],
            [
             1,
             "#fde725"
            ]
           ]
          },
          "colorway": [
           "#1F77B4",
           "#FF7F0E",
           "#2CA02C",
           "#D62728",
           "#9467BD",
           "#8C564B",
           "#E377C2",
           "#7F7F7F",
           "#BCBD22",
           "#17BECF"
          ],
          "font": {
           "color": "rgb(36,36,36)"
          },
          "geo": {
           "bgcolor": "white",
           "lakecolor": "white",
           "landcolor": "white",
           "showlakes": true,
           "showland": true,
           "subunitcolor": "white"
          },
          "height": 250,
          "hoverlabel": {
           "align": "left"
          },
          "hovermode": "closest",
          "mapbox": {
           "style": "light"
          },
          "margin": {
           "b": 10,
           "l": 10,
           "r": 10,
           "t": 10
          },
          "paper_bgcolor": "white",
          "plot_bgcolor": "white",
          "polar": {
           "angularaxis": {
            "gridcolor": "rgb(232,232,232)",
            "linecolor": "rgb(36,36,36)",
            "showgrid": false,
            "showline": true,
            "ticks": "outside"
           },
           "bgcolor": "white",
           "radialaxis": {
            "gridcolor": "rgb(232,232,232)",
            "linecolor": "rgb(36,36,36)",
            "showgrid": false,
            "showline": true,
            "ticks": "outside"
           }
          },
          "scene": {
           "xaxis": {
            "backgroundcolor": "white",
            "gridcolor": "rgb(232,232,232)",
            "gridwidth": 2,
            "linecolor": "rgb(36,36,36)",
            "showbackground": true,
            "showgrid": false,
            "showline": true,
            "ticks": "outside",
            "zeroline": false,
            "zerolinecolor": "rgb(36,36,36)"
           },
           "yaxis": {
            "backgroundcolor": "white",
            "gridcolor": "rgb(232,232,232)",
            "gridwidth": 2,
            "linecolor": "rgb(36,36,36)",
            "showbackground": true,
            "showgrid": false,
            "showline": true,
            "ticks": "outside",
            "zeroline": false,
            "zerolinecolor": "rgb(36,36,36)"
           },
           "zaxis": {
            "backgroundcolor": "white",
            "gridcolor": "rgb(232,232,232)",
            "gridwidth": 2,
            "linecolor": "rgb(36,36,36)",
            "showbackground": true,
            "showgrid": false,
            "showline": true,
            "ticks": "outside",
            "zeroline": false,
            "zerolinecolor": "rgb(36,36,36)"
           }
          },
          "shapedefaults": {
           "fillcolor": "black",
           "line": {
            "width": 0
           },
           "opacity": 0.3
          },
          "ternary": {
           "aaxis": {
            "gridcolor": "rgb(232,232,232)",
            "linecolor": "rgb(36,36,36)",
            "showgrid": false,
            "showline": true,
            "ticks": "outside"
           },
           "baxis": {
            "gridcolor": "rgb(232,232,232)",
            "linecolor": "rgb(36,36,36)",
            "showgrid": false,
            "showline": true,
            "ticks": "outside"
           },
           "bgcolor": "white",
           "caxis": {
            "gridcolor": "rgb(232,232,232)",
            "linecolor": "rgb(36,36,36)",
            "showgrid": false,
            "showline": true,
            "ticks": "outside"
           }
          },
          "title": {
           "x": 0.5,
           "xanchor": "center"
          },
          "width": 350,
          "xaxis": {
           "automargin": true,
           "gridcolor": "rgb(232,232,232)",
           "linecolor": "rgb(36,36,36)",
           "showgrid": true,
           "showline": true,
           "ticks": "outside",
           "title": {
            "standoff": 15
           },
           "zeroline": false,
           "zerolinecolor": "rgb(36,36,36)"
          },
          "yaxis": {
           "automargin": true,
           "gridcolor": "rgb(232,232,232)",
           "linecolor": "rgb(36,36,36)",
           "showgrid": true,
           "showline": true,
           "ticks": "outside",
           "title": {
            "standoff": 15
           },
           "zeroline": false,
           "zerolinecolor": "rgb(36,36,36)"
          }
         }
        },
        "width": 350,
        "xaxis": {
         "anchor": "y",
         "autorange": true,
         "categoryarray": [
          "low",
          "medium",
          "high"
         ],
         "categoryorder": "array",
         "domain": [
          0,
          1
         ],
         "range": [
          -0.5,
          2.5
         ],
         "title": {
          "text": "Suitability for children"
         },
         "type": "category"
        },
        "yaxis": {
         "anchor": "x",
         "autorange": true,
         "domain": [
          0,
          1
         ],
         "range": [
          0,
          5.2631578947368425
         ],
         "title": {
          "text": "count"
         },
         "type": "linear"
        }
       }
      },
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAl0AAAFoCAYAAABt3U6oAAAAAXNSR0IArs4c6QAAIABJREFUeF7tnQmUVcW1sHfTIFMYlNFZVJ7zCImKgpqoicboUxSnOERFHIiKiUHxGVGC4BAFjPoEjaj4i/ocUBkdAoJiXpjECcSIioqKoswgQr+1z/L23zRN96X61qld535nLdfLo09V7fr27r7frap7bklZWVmZcEEAAhCAAAQgAAEIeCVQgnR55UvnEIAABCAAAQhAICGAdFEIEIAABCAAAQhAIAUCSFcKkBkCAhCAAAQgAAEIIF3UAAQgAAEIQAACEEiBANKVAmSGgAAEIAABCEAAAkgXNQABCEAAAhCAAARSIIB0pQCZISAAAQhAAAIQgADSRQ1AAAIQgAAEIACBFAggXSlAZggIQAACEIAABCCQCelq3769zJs3j2ymQGDJkiVSv359adCgQQqjMQQE0iHw5ZdfSps2bdIZjFEgkAKB1atXy5o1a6RZs2YpjMYQ+RJAuvIlxX0JAaSLQsgiAaQri1kt7jkhXTbzj3TZzIvZqJAus6khsFoQQLpqAY+mJgkgXSbTko2vAWJ7Mb3iQrrSY81I6RFAutJjzUjpEEC60uG8uaOw0rW5xIr8fqSryAsgo9NHujKa2CKeFtJlM/lIl828mI0K6TKbGgKrBQGkqxbwaGqSANJlMi1sL9pMi92okC67uSEydwJIlzs7WtokgHTZzAsrXTbzYjYqpMtsagisFgSQrlrAo6lJAkiXybSw0mUzLXajQrrs5obI3AkgXe7saGmTANJlMy+sdNnMi9mokC6zqSGwWhBAumoBj6YmCSBdJtNia6Vr1apVsu+++25A6s4775Tjjz++Wno8MiK94kK60mPNSOkRQLrSY81I6RBAutLhvLmjmFrpWrhwoZx00kkyadKk8nnUrVtXSktLka7Nzayn+5EuT2DpNigBpCsofgb3QADp8gC1AF2akq45c+bI1VdfLc8///xmTY2Vrs3CVaubka5a4aOxUQJIl9HEEJYzAaTLGZ3Xhqaka+rUqdKrVy/p2rWrtGzZUo466ijZfvvtawSAdNWIqGA3IF0FQ0lHhgggXYaSQSgFIYB0FQRjwTsxJV1fffWVjBw5Urbaait59913ZcyYMTJgwAA59thjyyd+wgknbAThvffek9dee63WcOrVqydfrKx1N3QAAWlRb60ZCqvK6snKH8yEQyCREmhUV6RhiZ26jhRjqmGXlZVJSUlJqmMW62CtW7fOa+qmpKtyxCNGjJBx48aJ/t/cpYftK196+P7999/Pa8I13TR08ny588XC9FXTWPw8mwSe73mo7Nr6J2Ymt2j5GunxyHR5b+EyMzERSFwE9ti6idx3dgdp9ZP6cQVexNHqStf3338vTZs2LWIK6U09X7k1LV1TpkyRnj17yqxZs6olV8jtxaGvfig3j3kvvUwxUuYIvNiri7Rv08TMvBYtWyMXPTJNZn7ynZmYCCQuAgfs0FyGnt1RWjVBumLJHNuLNjNlSrr0IH3btm2lefPmoluN119/vTRq1Ej0sRHVXUiXzeIq1qiQrmLNfHbnjXTFl1uky2bOTEnXCy+8IEOGDJGlS5eKnq/S81vnn3++tGjRAumyWT9EVQUBpIuyyBoBpCu+jCJdNnNmSrpyiJYvXy6NGzfO+wAgK102i6tYo0K6ijXz2Z030hVfbpEumzkzKV2biwrp2lxi3O+TANLlky59hyCAdIWgXrsxka7a8fPVGumqRJaD9L5KrXj6RbqKJ9fFMlOkK75MI102c4Z0IV02KzPiqJCuiJNH6FUSQLriKwyky2bOkC6ky2ZlRhwV0hVx8ggd6cpIDSBdNhOJdCFdNisz4qiQroiTR+hIV0ZqAOmymUikC+myWZkRR4V0RZw8Qke6MlIDSJfNRCJdSJfNyow4KqQr4uQROtKVkRpAumwmEulCumxWZsRRIV0RJ4/Q
ka6M1ADSZTORSBfSZbMyI44K6Yo4eYSOdGWkBpAum4lEupAum5UZcVRIV8TJI3SkKyM1gHTZTCTShXTZrMyIo0K6Ik4eoSNdGakBpMtmIpEupMtmZUYcFdIVcfIIHenKSA0gXTYTiXQhXTYrM+KokK6Ik0foSFdGagDpsplIpAvpslmZEUeFdEWcPEJHujJSA0iXzUQiXUiXzcqMOCqkK+LkETrSlZEaQLpsJhLpQrpsVmbEUSFdESeP0JGujNQA0mUzkUgX0mWzMiOOCumKOHmEjnRlpAaQLpuJRLqQLpuVGXFUSFfEySN0pCsjNYB02Uwk0oV02azMiKNCuiJOHqEjXRmpAaTLZiKRLqTLZmVGHBXSFXHyCB3pykgNIF02E4l0IV02KzPiqJCuiJNH6EhXRmoA6bKZSKQL6bJZmRFHhXRFnDxCR7oyUgNIl81EIl1Il83KjDgqpCvi5BE60pWRGkC6bCYS6UK6bFZmxFEhXREnj9CRrozUANJlM5FIF9JlszIjjgrpijh5hI50ZaQGkC6biUS6kC6blRlxVEhXxMkjdKQrIzWAdNlMJNKFdNmszIijQroiTh6hI10ZqQGky2YikS6ky2ZlRhwV0hVx8ggd6cpIDSBdNhOJdCFdNisz4qiQroiTR+hIV0ZqAOmymUikC+myWZkRR4V0RZw8Qke6MlIDSJfNRCJdSJfNyow4KqQr4uQROtKVkRpAumwmEulCumxWZsRRIV0RJ4/Qka6M1ADSZTORSBfSZbMyI44K6Yo4eYSOdGWkBpAum4lEupAum5UZcVRIV8TJI3SkKyM1gHTZTCTShXTZrMyIo0K6Ik4eoSNdGakBpMtmIpEupMtmZUYcFdIVcfIIHenKSA0gXTYTiXQhXTYrM+KokK6Ik0foSFdGagDpsplIpAvpslmZEUeFdEWcPEJHujJSA0iXzUQiXUiXzcqMOCqkK+LkETrSlZEaQLpsJhLpQrpsVmbEUSFdESeP0JGujNQA0mUzkUgX0mWzMiOOCumKOHmEjnRlpAaQLpuJRLqQLpuVGXFUSFfEySN0pCsjNYB02Uwk0oV02azMiKNCuiJOHqEjXRmpAaTLZiKRLqTLZmVGHBXSFXHyCB3pykgNIF02E2lautasWSNbbLGFlJSUVEuvffv2Mm/evIIQHvrqh3LzmPcK0hedFCcBpKs4857lWR+wQ3MZenZHadWkfpanmam5IV0202lWusaOHSuXX365TJgwQdq1a4d02awfoqqCANJFWWSNANIVX0aRLps5MyldixYtktNOO00WL14szzzzDNJls3aIahMEkC5KI2sEkK74Mop02cyZSenq0aOHHHvssfLXv/5VHn74YaTLZu0QFdJFDRQJAaQrvkQjXTZzZk66nn76adGtxWHDhsnBBx8sjz32GNJls3aICumiBoqEANIVX6KRLps5MyVdn332mXTt2lVGjRolbdq0qVK6+vfvvxHJ4cOHy7Rp02pNuLS0VEZM/0IGjp1b677ooHgJjLviMNm6kZ35L1tbIr9/4i2Z+cl3doIikqgIqHTd1W0faVKvLKq4iznYdevWyfr166VevXrFjCG1uTdr1iyvscxIlxbHeeedJ506dZJTTjklCf7444+Xu+++W/bcc09p2LBh8m8TJ07caGLdu3eXt956K68JV3dTnTp15MGpC2TA2Dm17osOipfA+Cs7y47NtzAD4NtV6+TSx2YhXWYyEl8gKl33nLG/bNmwNL7gizTitWvXiv7XqJGhd4AZzkWDBg3ymp0Z6Vq+fLkccMABVQatq1vdunXb5IR4ZEReueamlAhwkD4l0AyTGgG2F1NDXbCB2F4sGMqCdmRGuqqaFWe6CpprOkuJANKVEmiGSY0A0pUa6oINhHQVDGVBOzIvXSNHjpSddtqp2kmz0lXQmqCzWhJAumoJkObmCCBd5lJSY0BIV42IgtxgWrryJYJ05UuK+9IggHSlQZkx0iSAdKVJuzBjIV2F4Vjo
XpCuSkT5GqBCl1jx9Yd0FV/Osz5jpCu+DCNdNnOGdCFdNisz4qiQroiTR+hVEkC64isMpMtmzpAupMtmZUYcFdIVcfIIHenKSA0gXTYTiXQhXTYrM+KokK6Ik0foSFdGagDpsplIpAvpslmZEUeFdEWcPEJHujJSA0iXzUQiXUiXzcqMOCqkK+LkETrSlZEaQLpsJhLpQrpsVmbEUSFdESeP0JGujNQA0mUzkUgX0mWzMiOOCumKOHmEjnRlpAaQLpuJRLqQLpuVGXFUSFfEySN0pCsjNYB02Uwk0oV02azMiKNCuiJOHqEjXRmpAaTLZiKRLqTLZmVGHBXSFXHyCB3pykgNIF02E4l0IV02KzPiqJCuiJNH6EhXRmoA6bKZSKQL6bJZmRFHhXRFnDxCR7oyUgNIl81EIl1Il83KjDgqpCvi5BE60pWRGkC6bCYS6UK6bFZmxFEhXREnj9CRrozUANJlM5FIF9JlszIjjgrpijh5hI50ZaQGkC6biUS6kC6blRlxVEhXxMkjdKQrIzWAdNlMJNKFdNmszIijQroiTh6hI10ZqQGky2YikS6ky2ZlRhwV0hVx8ggd6cpIDSBdNhOJdCFdNisz4qiQroiTR+hIV0ZqAOmymUikC+myWZkRR4V0RZw8Qke6MlIDSJfNRCJdSJfNyow4KqQr4uQROtKVkRpAumwmEulCumxWZsRRIV0RJ4/Qka6M1ADSZTORSBfSZbMyI44K6Yo4eYSOdGWkBpAum4lEupAum5UZcVRIV8TJI3SkKyM1gHTZTCTShXTZrMyIo0K6Ik4eoSNdGakBpMtmIpEupMtmZUYcFdIVcfIIHenKSA0gXTYTiXQhXTYrM+KokK6Ik0foSFdGagDpsplIpAvpslmZEUeFdEWcPEJHujJSA0iXzUQiXUiXzcqMOCqkK+LkETrSlZEaQLpsJhLpQrpsVmbEUSFdESeP0JGujNQA0mUzkUgX0mWzMiOOCumKOHmEjnRlpAaQLpuJRLqQLpuVGXFUSFfEySN0pCsjNYB02Uwk0oV02azMiKNCuiJOHqEjXRmpAaTLZiKRLqTLZmVGHBXSFXHyCB3pykgNIF02E4l0IV02KzPiqJCuiJNH6EhXRmoA6bKZSKQL6bJZmRFHhXRFnDxCR7oyUgNIl81EIl1Il83KjDgqpCvi5BE60pWRGkC6bCYS6UK6bFZmxFEhXREnj9CRrozUANJlM5FIF9JlszIjjgrpijh5hI50ZaQGkC6biUS6kC6blRlxVEhXxMkjdKQrIzWAdNlMJNKFdNmszIijQroiTh6hI10ZqQGky2YikS6ky2ZlRhwV0hVx8ggd6cpIDSBdNhOJdCFdNisz4qiQroiTR+hIV0ZqAOmymUiT0rVq1SqpW7eu1KtXLy9q7du3l3nz5uV1b003DX31Q7l5zHs13cbPIbBJAkgXxZE1Agfs0FyGnt1RWjWpn7WpZXY+SJfN1JqSrilTpkj//v1l4cKFCa0uXbpIv379pFmzZtXSQ7psFlexRoV0FWvmsztvpCu+3CJdNnNmSrrefPNNWbt2rXTs2FF0teuyyy6TTp06yYUXXoh02awfoqqCANJFWWSNANIVX0aRLps5MyVdlRH17t1bmjdvLtdeey3SZbN+iArpogaKgADSFV+SkS6bOTMnXYsXL5YPP/xQpk6dKo8++qgMHz5cdt99d6TLZv0QFdJFDRQBAaQrviQjXTZzZk66XnjhBRkyZIjMnz9fzjrrrGSLsVWrVuX0XnzxxY1IXnrppTJ79uxaE65Tp44Mf+NTGTB2Tq37ooPiJTD+ys6yQ7P8PgSSBqXvVq+XSx+bJTM/+S6N4RgjgwRUuu45Y39p3qCOmdnp32uuTRP44YcfZN26dVK/Ph9+qKlO1q9fX9MtNf68YcOGNd6jN5iTrlzUS5YskRtvvDEpmsGDB5dP5tZbb91oYsOGDZPp06fnNeHqbiotLZVHpi2UgWPn1rovOiheAuOuOEy2
aVxiBsDStSXy+8dnI11mMhJfICpdd522rzStV2Ym+M9XlMmYt74wE4/FQDRbdv4S2SPUqH6pnN1x68Qzans1bdo0ry6cpWvlypXSqFGjvAZxvWnSpElyxRVXyKxZs6rtgk8vuhKmnQ8CHKT3QZU+QxKwuL0478tlcvSdr4bEwtiRE+hz3B5yUZedU51FtdJ12223SevWreXcc8/dKKj9999fRo8eLdtuu23BAp4xY4bsscceost0erarb9++snz5cvn73/+OdBWMMh35JoB0+SZM/2kTQLrSJs54aRCIRro+/fRTOfLII+XVV1+VrbfeumBs9BldenC+RYsWyYNRdYwePXrUKHasdBUsBXRUAAJIVwEg0oUpAkiXqXQQTIEImJEuXXG64YYbZMGCBdKgQYMNDrLr4bwPPvhAOnToICNHjizQ1P9/N/qcLl3d2nLLLfPuG+nKGxU3pkAA6UoBMkOkSgDpShU3g6VEwIx06Un+d955R2655RZp0qSJHHLIIeUISkpKRCVHH2CqX9Vj4UK6LGSBGHIEkC5qIWsEkK6sZZT5KAEz0pVLx/vvv5+sOh144IGmM4R0mU5P0QWHdBVdyjM/YaQr8ykuygmak65cFvTjlIsWLdooKW3atBFd+Qp9IV2hM8D4FQkgXdRD1gggXVnLKPMxudKlj4UYOHCgPPfcc7JixYqNsjRt2rQav4w6jdQiXWlQZox8CSBd+ZLivlgIIF2xZIo4N4eAuZWucePGJY9t0O8+3GeffTZa1dpxxx3FwlOBka7NKTPu9U0A6fJNmP7TJoB0pU2c8dIgYE66+vTpk3yK8Oqrr05j/s5jIF3O6GjogQDS5QEqXQYlgHQFxc/gngiYk65BgwYlXz6t34Vo+UK6LGen+GJDuoov51mfMdKV9QwX5/zMSdfHH38sp512mtx8883JIyLy/W6htNOHdKVNnPGqI4B0UR9ZI4B0ZS2jzEcJmJMu/fqdAQMGbDI7HKSncCGwMQGki6rIGgGkK2sZZT4mpeujjz6Szz//fJPZOeigg6S0tDR49ljpCp4CAqhAAOmiHLJGAOnKWkaZj0npiiUtSFcsmSqOOJGu4shzMc0S6SqmbBfPXM1tL06aNElmzpy5yQxcfPHFyXczhr6QrtAZYPyKBJAu6iFrBJCurGWU+Zhc6XryySdl/Pjx5dn5/vvvZf78+fLFF1/IEUcckXyqsWHDhsGzh3QFTwEBsL1IDWSYANKV4eQW8dTMrXRVlYuysjK55557ZPLkyTJy5EgT6UK6TKSBIH4kwEoXpZA1AkhX1jLKfEyudG0qLd98840cfPDB8tprr0nr1q2DZw/pCp4CAmClixrIMAGkK8PJLeKpRbHSpfnR7cXOnTvLP//5T9lqq62CpwzpCp4CAkC6qIEME0C6MpzcIp6aOemaNWuWzJ07d4OUfPvttzJ69Ghp3Lgx24tFXKxMfdME2F6kOrJGAOnKWkaZj8ntxaFDh24kVipbv/jFL+TEE0+Udu3amcgcK10m0kAQPxJAuiiFrBFAurKWUeZjUrpiSQvSFUumiiNOpKs48lxMs0S6iinbxTNXc9uLOfTLli1LHhWxfPly2WabbWTHHXeUkpISM5lBusykgkBEBOmiDLJGAOnKWkaZj9mVrhEjRsiNN964QYY6dOggt99+u2y33XYmMod0mUgDQbC9SA1klADSldHEFvm0zK10vf7663LuuedKr169kk8rtm3bVmbMmCEPPvigrFu3TvThqRYupMtCFoghR4CVLmohawSQrqxllPmYXOnq37+/zJs3T4YPH75BhmbPni1du3aVqVOnSsuWLYNnD+kKngICqEAA6aIcskYA6cpaRpmPSenq3bu3NG3aVK677roNMrR48WI56KCD5OWXX5YddtghePaQruApIACkixrIMAGkK8PJLeKpmdtefPjhh+WOO+5Ivn+xTZs2SWp0W3Hw4MHyxBNPyBtvvGEiXUiXiTQQxI8EWOmiFLJG
AOnKWkaZj8mVrlWrVkm3bt1kzpw5suuuuyYH5ydOnJhk6/7775fDDz/cROaQLhNpIAikixrIKAGkK6OJLfJpmVvp0nysXbtWRo0aJW+//basWLFC9tprL+nSpYvsvPPOZtKFdJlJBYHwyAhqIIMEkK4MJpUpiTnp+vzzz6VOnTrJpxYrXkuWLJHPPvtM9txzTxNpQ7pMpIEgWOmiBjJKAOnKaGKLfFrmpOu2226TBQsWyJAhQzZIzbvvvpt8DZBuO5aWlgZPG9IVPAUEUIEAZ7ooh6wRQLqyllHmowTMSZc+n2uXXXaRnj17bpChNWvWyN577y0TJkww8f2LSBe/QJYIIF2WskEshSCAdBWCIn1YI2BOum699VaZPHmyPP/88xuwmjJlivzud7+TadOmSbNmzYJzRLqCp4AAWOmiBjJMAOnKcHKLeGrmpGvWrFly6qmnJv8dccQRyfcuzpw5Ux555JHk/+/Tp4+JdCFdJtJAED8SYKWLUsgaAaQraxllPia3FzWo0aNHS79+/eSbb74pz9JRRx0lN910k7Rq1cpE5pAuE2kgCKSLGsgoAaQro4kt8mmZW+nK5WP9+vXy9ddfy9KlS2XrrbeWxo0bm0oV0mUqHUUfDCtdRV8CmQOAdGUupUzI4kH6WLKCdMWSqeKIE+kqjjwX0yyRrmLKdvHM1exKl/UUIF3WM1Rc8SFdxZXvYpgt0lUMWS6+OSJdjjlHuhzB0cwLAaTLC1Y6DUgA6QoIn6G9EUC6HNEiXY7gaOaFANLlBSudBiSAdAWEz9DeCCBdjmiRLkdwNPNCAOnygpVOAxJAugLCZ2hvBJAuR7RIlyM4mnkhgHR5wUqnAQkgXQHhM7Q3AkiXI1qkyxEczbwQQLq8YKXTgASQroDwGdobAaTLES3S5QiOZl4IIF1esNJpQAJIV0D4DO2NANLliBbpcgRHMy8EkC4vWOk0IAGkKyB8hvZGAOlyRIt0OYKjmRcCSJcXrHQakADSFRA+Q3sjgHQ5okW6HMHRzAsBpMsLVjoNSADpCgifob0RQLp+RLtmzRqpU6eO1KtXLy/YSFdemLgpJQJIV0qgGSY1AkhXaqgZKEUCRS9dc+bMkb/85S/y9ttvJ9h/+ctfSt++faVhw4bVpgHpSrFKGapGAkhXjYi4ITICSFdkCSPcvAgUvXTNnj1bFixYIMcdd5ysXLlSLrjgAunWrZucfPLJSFdeJcRNFgggXRayQAyFJIB0FZImfVkhUPTSVTkRQ4YMkWXLlsl1112HdFmpUuKokQDSVSMiboiMANIVWcIINy8CSFcFTGVlZXLSSSfJZZddJkcffXT5T5YsWbIRzI4dO8rcuXPzglzTTcOmfCQDxrxX0238HAKbJDChVxfZtVVjM4S+Xv699BgxXWZ+8p2ZmAgkLgIqXff9toO0/MkWZgL/YNEKOebOV83EQyDxEbj2uD2k+2E7FSRwPYeez1VSpnZj8Lrvvvtk8uTJMnz4cKlbt255hKeccspG0b755pvy+uuv13oWenB/5KyvZeC4wghcrQOigygJjL38UGlZb62Z2FeurytXPvUu0mUmI/EFotI1qOue0qjOD2aC/3ptPTl2yGtm4iGQ+Ahc86vd5PT9W8ratbX/e92qVau8AJiUrtGjR8vAgQPl2WeflRYtWtQ4EQ7S14iIG1IkwPZiirAZKhUCbC+mgplBUibA9qKITJo0SXr37i0PPfSQ7LbbbnmlAOnKCxM3pUQA6UoJNMOkRgDpSg01A6VIoOila9q0aXLhhRfKAw88IHvttVc5+gYNGlSbBqQrxSplqBoJIF01IuKGyAggXZEljHDzIlD00tW9e3eZOHHiRrDeeecd2WKLTR/gRLryqi9uSokA0pUSaIZJjQDSlRpqBkqRQNFLlytrpMuVHO18EEC6fFClz5AEkK6Q9BnbFwGky5Es0uUIjmZeCCBdXrDSaUACSFdA+AztjQDS5YgW6XIERzMvBJAuL1jpNCABpCsg
fIb2RgDpckSLdDmCo5kXAkiXF6x0GpAA0hUQPkN7I4B0OaJFuhzB0cwLAaTLC1Y6DUgA6QoIn6G9EUC6HNEiXY7gaOaFANLlBSudBiSAdAWEz9DeCCBdjmiRLkdwNPNCAOnygpVOAxJAugLCZ2hvBJAuR7RIlyM4mnkhgHR5wUqnAQkgXQHhM7Q3AkiXI1qkyxEczbwQQLq8YKXTgASQroDwGdobAaTLES3S5QiOZl4IIF1esNJpQAJIV0D4DO2NANLliBbpcgRHMy8EkC4vWOk0IAGkKyB8hvZGAOlyRIt0OYKjmRcCSJcXrHQakADSFRA+Q3sjgHQ5okW6HMHRzAsBpMsLVjoNSADpCgifob0RQLoc0SJdjuBo5oUA0uUFK50GJIB0BYTP0N4IIF2OaJEuR3A080IA6fKClU4DEkC6AsJnaG8EkC5HtEiXIziaeSGAdHnBSqcBCSBdAeEztDcCSJcjWqTLERzNvBBAurxgpdOABJCugPAZ2hsBpMsRLdLlCI5mXgggXV6w0mlAAkhXQPgM7Y0A0uWIFulyBEczLwSQLi9Y6TQgAaQrIHyG9kYA6XJEi3Q5gqOZFwJIlxesdBqQANIVED5DeyOAdDmiRbocwdHMCwGkywtWOg1IAOkKCJ+hvRFAuhzRIl2O4GjmhQDS5QUrnQYkgHQFhM/Q3gggXY5okS5HcDTzQgDp8oKVTgMSQLoCwmdobwSQLke0SJcjOJp5IYB0ecFKpwEJIF0B4TO0NwJIlyNapMsRHM28EEC6vGCl04AEkK6A8BnaGwGkyxEt0uUIjmZeCCBdXrDSaUACSFdA+AztjQDS5YgW6XIERzMvBJAuL1jpNCABpCsgfIb2RgDpckSLdDmCo5kXAkiXF6x0GpAA0hUQPkN7I4B0OaJFuhzB0cwLAaTLC1Y6DUgA6QoIn6G9EUC6HNEiXY7gaOaFANLlBSudBiSAdAWEz9DeCCBdjmiRLkdwNPNCAOnygpVOAxJAugLCZ2hvBJAuR7Q76U2TAAAeWklEQVRIlyM4mnkhgHR5wUqnAQkgXQHhM7Q3AkiXI1qkyxEczbwQQLq8YKXTgASQroDwGdobAaTLES3S5QiOZl4IIF1esNJpQAJIV0D4DO2NANLliBbpcgRHMy8EkC4vWOk0IAGkKyB8hvZGAOlyRIt0OYKjmRcCSJcXrHQakADSFRA+Q3sjgHQ5okW6HMHRzAsBpMsLVjoNSADpCgifob0RQLoc0SJdjuBo5oUA0uUFK50GJIB0BYTP0N4IIF2OaJEuR3A080IA6fKClU4DEkC6AsJnaG8EkC5HtEiXIziaeSGAdHnBSqcBCSBdAeEztDcCSJcjWqTLERzNvBBAurxgpdOABJCugPAZ2hsBpMsRLdLlCI5mXgggXV6w0mlAAkhXQPgM7Y0A0uWIFulyBEczLwSQLi9Y6TQgAaQrIHyG9kYA6XJEi3Q5gqOZFwJIlxesdBqQANIVED5DeyOAdDmiRbocwdHMCwGkywtWOg1IAOkKCJ+hvRFAuhzRIl2O4GjmhQDS5QUrnQYkgHQFhM/Q3gggXSLyww8/yB133CHDhg2TuXPnSp06dWoEjnTViIgbUiSAdKUIm6FSIYB0pYKZQVImUPTStWzZMunevbs0a9ZMXnnlFaQr5QJkuMIQQLoKw5Fe7BBAuuzkgkgKR6DopWv9+vUyfvx46dSpk3Ts2BHpKlxt0VOKBJCuFGEzVCoEkK5UMDNIygSKXrpyvJcsWYJ0pVx8DFc4AkhX4VjSkw0CSJeNPBBFYQkgXT/yrE66evbsuRF1XR375z//Wets1K1bV/7fjK9k4Li5te6LDoqXwLjLD5PWDdaZAbBiXalc/uTbMvOT78zERCBxEVDpGnLq3tK41E5df7W6VH41ZEpcIInWFIFrfrWbnHlg6+QseW2vrbbaKq8uSsrKysryujPFm6qTLj1cX/k6/vjj5d133611hCUlJfLA6x/LgDFzat0XHRQvgQm9
uki7rRqYAfDNyh/kkkdnIF1mMhJfICpd9551oLRoVNdM8PMXr5Zj7nzVTDwEEh+Ba4/bXS7otKMUQoPq1auXF4DopKuqWfHpxbxyzU0pEWB7MSXQDJMaAbYXU0PNQCkSYHvxR9ic6Uqx6hiq4ASQroIjpcPABJCuwAlgeC8EkC6ky0th0Wm6BJCudHkzmn8CSJd/xoyQPgGky5E524uO4GjmhQDS5QUrnQYkgHQFhM/Q3gggXY5okS5HcDTzQgDp8oKVTgMSQLoCwmdobwSQLke0SJcjOJp5IYB0ecFKpwEJIF0B4TO0NwJIlyNapMsRHM28EEC6vGCl04AEkK6A8BnaGwGkyxEt0uUIjmZeCCBdXrDSaUACSFdA+AztjQDS5YgW6XIERzMvBJAuL1jpNCABpCsgfIb2RgDpckSLdDmCo5kXAkiXF6x0GpAA0hUQPkN7I4B0OaJFuhzB0cwLAaTLC1Y6DUgA6QoIn6G9EUC6HNEiXY7gaOaFANLlBSudBiSAdAWEz9DeCCBdjmiRLkdwNPNCAOnygpVOAxJAugLCZ2hvBJAuR7RIlyM4mnkhgHR5wUqnAQkgXQHhM7Q3AkiXI1qkyxEczbwQQLq8YKXTgASQroDwGdobAaTLES3S5QiOZl4IIF1esNJpQAJIV0D4DO2NANLliBbpcgRHMy8EkC4vWOk0IAGkKyB8hvZGAOlyRIt0OYKjmRcCSJcXrHQakADSFRA+Q3sjgHQ5okW6HMHRzAsBpMsLVjoNSADpCgifob0RQLoc0SJdjuBo5oUA0uUFK50GJIB0BYTP0N4IIF2OaJEuR3A080IA6fKClU4DEkC6AsJnaG8EkC5HtEiXIziaeSGAdHnBSqcBCSBdAeEztDcCSJcjWqTLERzNvBBAurxgpdOABJCugPAZ2hsBpMsRLdLlCI5mXgggXV6w0mlAAkhXQPgM7Y0A0uWIFulyBEczLwSQLi9Y6TQgAaQrIHyG9kYA6XJEi3Q5gqOZFwJIlxesdBqQANIVED5DeyOAdDmiRbocwdHMCwGkywtWOg1IAOkKCJ+hvRFAuhzRIl2O4GjmhQDS5QUrnQYkgHQFhM/Q3gggXY5okS5HcDTzQgDp8oKVTgMSQLoCwmdobwSQLke0SJcjOJp5IYB0ecFKpwEJIF0B4TO0NwJIlyNapMsRHM28EEC6vGCl04AEkK6A8BnaGwGkyxEt0uUIjmZeCCBdXrDSaUACSFdA+AztjQDS5YgW6XIERzMvBJAuL1jpNCABpCsgfIb2RgDpckSLdDmCo5kXAkiXF6x0GpAA0hUQPkN7I4B0OaJFuhzB0cwLAaTLC1Y6DUgA6QoIn6G9EUC6HNEiXY7gaOaFANLlBSudBiSAdAWEz9DeCCBdjmiRLkdwNPNCAOnygpVOAxJAugLCZ2hvBJAuR7RIlyM4mnkhgHR5wUqnAQkgXQHhM7Q3AkiXI1qkyxEczbwQQLq8YKXTgASQroDwGdobAaTLES3S5QiOZl4IIF1esNJpQAJIV0D4DO2NANLliBbpcgRHMy8EkC4vWOk0IAGkKyB8hvZGAOlyRIt0OYKjmRcCSJcXrHQakADSFRA+Q3sjgHQ5okW6HMHRzAsBpMsLVjoNSADpCgifob0RQLoc0SJdjuBo5oUA0uUFK50GJIB0BYTP0N4IIF2OaJEuR3A080IA6fKClU4DEkC6AsJnaG8EkC5HtEiXIziaeSGAdHnBSqcBCSBdAeEztDcCSJcjWqTLERzNvBBAurxgpdOABJCugPAZ2hsBpMsRLdLlCI5mXgggXV6w0mlAAkhXQPgM7Y0A0vUj2nXr1snq1aulcePGecFGuvLCxE0pEUC6UgLNMKkRQLpSQ81AKRJAukTk7rvvlkGDBiXC9dOf/rT8f1eXB6QrxSplqBoJIF01IuKGyAggXZEljHDzIlD00jV79my56KKLZNSoUdKyZUv54x//KNtvv71cddVV1QJEuvKqL25KiQDSlRJohkmN
ANKVGmoGSpFA0UvX4MGDZenSpXL99dcn2KdPny5XXnmlTJ48GelKsRAZqnYEkK7a8aO1PQJIl72cEFHtCRS9dPXp00d23313OeeccxKaX331lRx66KEyZ84cKS0t3SRhVrpqX3z0UDgCSFfhWNKTDQJIl408EEVhCRS9dPXs2VO6dOki3bp1S8jqqleHDh1k1qxZ5YfqH3744Y2o9+vXT+bNm1eQbAx99UN5c8F3BemLToqTwJVHtZf2bZqYmfyiZWuk73PvmImHQOIk0PeEvaRVk/pmgp/35TIZ9FJh/u6bmRSBpEpgv+2by0Vddk51zJKysrKyVEesZjDdVtxll13kvPPOS+764osvpHPnzjJ37lypU6dO8m+bki4rcyAOCEAAAhCAAASKi0C+Cz+mpOvee++VhQsXyk033ZRk64033pCrr766xjNdxZXasLPV3Bx55JGJDHNBICsEjj76aHnxxRezMh3mAYHkdfMf//iH/PnPf4aGIQKmpOuDDz6QU045RUaPHi1t27aVyy+/XHbaaadEvLhsEEC6bOSBKApLAOkqLE96C08A6Qqfg6oiMCVdGqBuH95xxx1JrAceeKDceeed0qxZM5v0ijAqpKsIk14EU0a6iiDJRTZFpMtmws1Jl2La3CfS20SbzaiQrmzmtdhnhXQVewVkb/5Il82cmpQum6iISgkgXdRBFgkgXVnManHPCemymX+ky2ZeiAoCEIAABCAAgYwRQLoyllCmAwEIQAACEICATQJIl828EBUEIAABCEAAAhkjgHRlLKFMBwIQgAAEIAABmwSQLpt5CRbVqFGjZP/995cdd9wxWAwMDIG0CYwbN07atWsnu+22W/Kdr1tuuaXUq1cv7TAYDwLVEtBP9t9///3Jt7bUr7/xVzLlU7s19UEK/BJAuvzyja73008/XS6++GI54ogjooudgCHgSuCSSy6RX/3qV3LiiSdK165dpUePHnLMMce4dkc7CHghsGbNGtl7771l+vTp0rRp043GyKd2a+rDS+B0Wk4A6aIYNiCAdFEQxUigonR988030rx5cyktLS1GFMzZMIGahCmf2q2pD8PTz0RoSFcm0li4SVSUriVLlsjNN98s48ePlyZNmsjZZ58tF154odxzzz1St27dZEVML/26pm233VZ69+6d/P99+vSRTp06yfHHH1+4wOip6Al8//33yUqU1qh+T2ujRo3kiiuukGXLlsl9990nP/nJT6R79+5y8sknJ6yWLl0qffv2lVdeeUVat24t1113nRx++OHJz2bMmCE33HCDLFiwQLp06SKLFy+WU089Nelf+zzjjDPk4IMPlr/97W/JVuNZZ52VtPv3v/+d9DNy5Mikj8cffzzZhnzhhReSryzT59j9z//8jzz33HPJNr3+bug3a3BBoBAEcsKkX42nNah/h3OrW9p/xdr98ssvk/rNfaeo/r60b99ehgwZkqyWbaqPQsRJH5smgHRRHZtc6brrrrvkX//6l/zpT38q/yJy/Yqmr7/+OjlX8NRTT8m3334rP/vZz6Rx48bJi1BZWZl06NAh+YOw++67QxcCBSOQe8H5+c9/nrxgTJs2Ta6//no55JBDEtH/5JNPklrVf9cXI63fd999N/k33Y65/fbb5bXXXktWsI499thEts4880wZM2ZM8tVj+vOc1OW22Pv16yctW7YUXQnTa86cOcl5mjfeeEOmTp0q55xzTvIz/c7YoUOHJhKm/3buuefKE088IfPmzUuEkAsChSCQ+x049NBD5Y9//KP88MMPyZuFl156KTmHW/FN8y233CIrVqyQ//qv/5KxY8cmbzL0b/TatWsT6dpUH4WIkz6QLmogTwIVf2l/85vfJC9m+qKm1+DBg2X16tVy/vnnJytZs2bNSr7FXt9JvfXWW8nP9V2/9qG/3HXq1MlzVG6DQM0Eci84Dz30UFJ/+uKx5557Ju/mf/nLXyYd6OqUviHQFxWtX11p2muvvZKfHXfccTJ8
+HDZZpttkhccrdkGDRokP9PzW5dddpmTdL355pvJqpvW/GmnnSavv/66tGrVSj7++GM56qij5P3335eSkpKaJ8gdEKiBQO53QGs8t2qrdatvILT2Kv791v+tbypOOOEE0VWvww47LPmbrW9I9PdjU32QBL8EWOnyyze63nO/tPvuu68cdNBByapB7gvHX375ZRk0aJA8//zzyUqBrjI8+eSTyS/z3Llzk/v0P/3F1lUDLggUkkDuBUdXmVq0aJF0rVt4zzzzTPLJw5w8DRgwQHbeeedkBVb/PXc2Sz+19Yc//CERIF0F0zrOXfpGQle5alrpeuedd+SCCy4oX+nS7URdRdBLfwf0hU/rXy9dEdY3LPrvvAEpZCUUb1+53wFdZdUVWL30jbHuKugKa0Xp0mMgEyZMSI6ETJo0SXR7Xt8Y19RH8dJNZ+ZIVzqcoxkl90vbuXPn5BdZpUpf2PQaNmxYsoWjWy4DBw5MVgl05UBXunQbRc8K6NkZ/eRj7lxNNBMnUPMEqjoArLWpjznJPeJEV6xUunR1a5999km2+yqfqdKVKN3+063CnJDpCtm11167kXTpmUZdxbryyisTPk8//bTceuut5dKlWzjPPvtsuXRpvyqFSJf5cooywKp+BzYlXbq6pSvA3bp1S95A6+qu/s2uqY8owUQUNNIVUbLSCLXiOyVdttbDyfpL/d1330nPnj3lqquukiOPPFImTpyYvBCpmOn5Lf1F1l9svfRQ8XbbbZdGuIxRRAQ2R7r0XKG+w1dh0jMtujKmbwz0sPuqVauSVTCVJ916mTJlSnJvVWe69Pld+mZjxIgRsnLlyuTDIm+//TbSVUR1Z2mqNQlTxb/f+ndZzyvqWS499qEfdtI3GTX1YWm+WYwF6cpiVmsxp8qfXuzVq5fot9XrpYeF9dNg+lA+/WSYvrBdc801yXaLXvrJGT0no58W44JAoQnkI1267d2/f/9kdevzzz9P6lO3YvRS8dItRT1vpW8MVMb0oLG+cWjbtm1yBkwlrOLvwPLly5M3F7o9ox8W0VUD/WRi7iB9dStd+vF9XUFje7HQlVC8/dUkTBVrV3ch9HdBt9j177Wex9UPe+y3334bPeur4mpZ8dJNZ+ZIVzqcox5FVwb08CVP6I46jUUbvNavnmfRh0lWPNC+fv16Uamq6iGTlWHpfQ0bNuTZXUVbRXFNfP78+XLSSSclbxZyZ3IfeOAB0Q996DEQrnAEkK5w7BkZAhCAAAQgUHACn332mfz6179OtsV32GEH+fDDD5OzuHo0RI+JcIUjgHSFY8/IEIAABCAAAS8EdAtdnxW3aNEiOeCAA6Rjx47JB0X4pgUvuPPuFOnKGxU3QgACEIAABCAAAXcCSJc7O1pCAAIQgAAEIACBvAkgXXmj4kYIQAACEIAABCDgTgDpcmdHSwhAAAIQgAAEIJA3AaQrb1TcCAEIQAACEIAABNwJIF3u7GgJAQhAAAIQgAAE8iaAdOWNihshAAEIQAACEICAOwGky50dLSEAAQhAAAIQgEDeBJCuvFFxIwQgAAEIQAACEHAngHS5s6MlBCAAAQhAAAIQyJsA0pU3Km6EAAQgAAEIQAAC7gSQLnd2tIQABCAAAQhAAAJ5E0C68kbFjRCAAAQgAAEIQMCdANLlzo6WEIAABCAAAQhAIG8CSFfeqLgRAhCAAAQgAAEIuBNAutzZ0RIC0RL45ptv5OOPP5Zdd91VmjZtmvc8tN1bb70lRxxxRN5tKt6o7S+++GIZMmSIbL311lX2MWfOHFm3bp3stddeUvn+ij9zCqBSo4ULF8rjjz8uL7/8shxwwAFy0003FaLbTfZR2/lX1fG0adMSng8//LDX2OkcAhCoPQGkq/YM6QEC0RB4//335brrrpNZs2aVx6zi9dxzz0m9evVqnMfTTz8tvXv3lrlz50qdOnVkwYIFsmrVKvmP//iPGtvqDV988YV07txZJkyYIO3atauyjfa/dOlSuffeeze6
v+LPtLHOY4cddpCtttoqr/Er36T9qXyqCNavX18OOeQQp37ybVTb+Vc1zksvvSSXXHKJzJs3L98wuA8CEAhEAOkKBJ5hIRCCQNeuXaVhw4by17/+VVq2bCmffPKJfPXVV3LQQQflFc7atWvlyy+/lO222y65v2/fvrLNNtvIRRddlFf7fKRjyZIlsn79etlyyy03kq6KP9MBf/7zn8vtt98uBx54YF7jV7xJV9M6dOggt956qxxzzDGb3d6lQW3nj3S5UKcNBOwQQLrs5IJIIOCdgErK2WefLb/73e82GuvDDz9MVnyefPJJadasWfLz5cuXy8knnyz33HNPshU5ceJE+e///m8ZOXJkshX3yCOPSOPGjaV169byi1/8IlkFmz59erJK9fbbbyd9nHjiiXLFFVdIo0aNyiXq5ptvlqeeekp0u3DvvfeWa6+9NtlO1Ouuu+6S1atXy9VXX72RdOV+pis7F154YTJWixYtki3Ss846Sx599FH529/+tsHK22233SZff/213HLLLeVz1nE1hqlTp5a3Hz58eCKQo0aNkgcffFA++ugj2WWXXRImRx99dNJ20aJFyThDhw6Vv/zlLzJp0qSEwcEHH7wBz++//z6JY9y4cTJ//vyEnbbR1URd6RswYEAy//feey/p+7TTTpOOHTvWOH+9QXMwaNCgJL59991XDjvsMNE55la6zj//fLn00ktFVyXHjBkj3bt3l8suu0xmz56djKtj7rTTTnLllVeWbxMrV82PCvXYsWMTMT/hhBPknHPO2aztZ+8FzAAQiJwA0hV5AgkfAptDQF+sVS50dejII4+U0tLSDUTkN7/5jbzxxhuJiOilK0sqA88++2wiRS+88EIiG3rPBx98IL///e+TLblTTjklEbVtt91WXn/9dZkyZUoiA99++22yndm/f3/59a9/XS5RKmo9evRIZOTvf/97IgIqMNrHn//852TLUkWi8spQ7mcDBw6UN998M5EV/d977LFHIkxnnHFGsvqlwqbXmjVrklU8lTq9N3fpvP79738n/6YisueeeyaxvPjii4mMaD8qkSotKi8qYTqfXDxt27aVTp06ifJS8al8Lu6GG25I5O23v/2tHH744Ylcap8qfypdOn9dHdx///0TgVVBfeWVV5Lwqpu/bg8rRxVOlbXx48cn/PTKSZfOf+XKlcn2rQrY7rvvLnXr1pUuXbokAq0SrFuSKoujR49OBFXHfOyxxxIJO/3005O8aZ7PO++8hAcXBCBQGAJIV2E40gsEoiCgL8a6taiHrvVFWVdB9IVY5UvFYHOkSyesL9D6Il/d9qKufulWnopeTlpUdFTUKoqdvsirBOUjXSpkKmYqPHoQPre9OGLEiGSF6bXXXkvmpCKjcqeHzXOrd7lEVdX+P//zPxNxvPvuu8vzqas9eimzXPynnnpqslJW1ZUTVZVNlZaKV669rropd730TNlRRx2VrDCp+FU3fx3ziSeekJkzZ0pJSUnSvl+/fklsFaVLt4FfffXV8nvuuOOOpF2Oi27fKjNd9fzDH/6QjPmPf/wjWUXLibjmTVf2clIXRYETJASME0C6jCeI8CDgg4AehFdB0VUWPc+kklEo6VLp0IP5unqlL9rar662DBs2rFxadGtNhSl3aQy6cqMrVLWRLl2h+dnPfiYPPPBAsrJz1VVXJcI3ePDgjTBWli4Vkd122y0ZX2Ukd+l2qv6nh/Zz0pRb+aoqN++8846ovD3zzDPJ1mlV0pVbOcz9rH379vLQQw8lq2fVzV8leYsttthACnWcP/3pTxtIl24NVlyh0u1Y3Ur96U9/Wh7Ov/71r2QVTtnomJor3RbOXbodqtKqNcIFAQgUhgDSVRiO9AKBKAno6oauUumqiMpS5ZUufSFWEahqe1EnXHmlSx+JoNtfuv123HHHJdt2ekZMV9gqSpduvenPcpdKSm4bsDbSpf2paOmlK2f77bdfMm5Vj7ioLF0qZyqHuRW3XGwqQ3fe
eecG0lXdpy9nzJiRrNjpVqWenapKuiq3z1e6dOWsTZs2G5xP0xWyyy+/fAPp0jNduZVEHV+3GRcvXpxsS1a8NE+6fVyROdIV5a8yQUdCAOmKJFGECQEfBD799NPkbJeuDOmZqGOPPVaef/75RD700oPqKlbVSZeeUdKD2nrpma9evXol21S6TaeXipy+uFeULn2ulI6l14oVK5KzTddff31ycDtf6dLD9vvss09yeF5Xt3KXruhoP7qFp4f9dWtRzzRVvqraXtRzVxq3boXmLl0l0k946upcPp8+1MPoev5Lz5rpp0ULKV3KSNlOnjy5vFtdhdMt44rbi5WlS7cgNX7NZ8VzfLlOkC4fv130CYGNCSBdVAUEioSAyo2uOukn7fTThvrpNz2vowfSdVWmrKws2fLTVRpdOdGD8rkVnk1Jl4qFblNpP/qJN22jB7VV4ioevNeVporSpefJ9MyQypKuJOkWl4qEylm+0pUTOu3jmmuuSbKonw7UbULdNlu2bFkyFz1EX9VVlXTlnkOmB+H1nJV++lA/BJA7g5WPdOlYurKkQqurbvrQVd1i1cdT6KdBq3pOWb4rXfohhXPPPTcRWxU6Za8rc7rCWJ105baOlceZZ56ZHOTXh9zqdqqOjXQVyR8BphmcANIVPAUEAIF0COj2oa5aqRjppS+8utqkj0DInT3Slar77rsvkQS9VGZUrHLbgRU/vag/1/v0Hj3HpAfqta0+u0tXn/TSFSw9r6WrTfqznLSo1OhqkoqgxlHxWVlVSVduq66yHOTiUemoeHA9t/pT+exURdK5lbKKB/H15/fff3/yn/apl4qTbsHqClEu/qq2Div2rStjGo+uSumlMqkH2fXwu0pX5fY1SVfF+1VelZde+ilT3TJUKawoXbryWHmVTT+xeOONNyZzyOVfz22ptFYlXTqOPqmfM13p/H4ySnEQQLqKI8/MEgLlBFR09IyVvmDrU+WruvT8j37ar6qtqE3d36RJk/Kn2utzqvQTdCpUm7p0Req7775LHoKa+ySeS5r0LJYeoNen0ufmo9uXetZJ/3O9tE9lsClG+fSrYqePraj8ycl82lZ3j85Zn9qv7Db3UvnWVU19zEVt5ra543I/BCAggnRRBRCAQKYIfPbZZ8nB+YrnxjI1QSYDAQhESwDpijZ1BA4BCFQkoE/U79OnT3JYPPcYDAhBAAIQsEQA6bKUDWKBAAScCeiW6f/+7//K9ttvn3x9DxcEIAABawSQLmsZIR4IQAACEIAABDJJAOnKZFqZFAQgAAEIQAAC1gggXdYyQjwQgAAEIAABCGSSANKVybQyKQhAAAIQgAAErBFAuqxlhHggAAEIQAACEMgkAaQrk2llUhCAAAQgAAEIWCPwf8g8D3s2vl/NAAAAAElFTkSuQmCC",
      "image/svg+xml": [
       "<svg class=\"main-svg\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"350\" height=\"250\" style=\"\" viewBox=\"0 0 350 250\"><rect x=\"0\" y=\"0\" width=\"350\" height=\"250\" style=\"fill: rgb(255, 255, 255); fill-opacity: 1;\"/><defs id=\"defs-1a9879\"><g class=\"clips\"><clipPath id=\"clip1a9879xyplot\" class=\"plotclip\"><rect width=\"290\" height=\"181\"/></clipPath><clipPath class=\"axesclip\" id=\"clip1a9879x\"><rect x=\"50\" y=\"0\" width=\"290\" height=\"250\"/></clipPath><clipPath class=\"axesclip\" id=\"clip1a9879y\"><rect x=\"0\" y=\"10\" width=\"350\" height=\"181\"/></clipPath><clipPath class=\"axesclip\" id=\"clip1a9879xy\"><rect x=\"50\" y=\"10\" width=\"290\" height=\"181\"/></clipPath></g><g class=\"gradients\"/><g class=\"patterns\"/></defs><g class=\"bglayer\"/><g class=\"layer-below\"><g class=\"imagelayer\"/><g class=\"shapelayer\"/></g><g class=\"cartesianlayer\"><g class=\"subplot xy\"><g class=\"layer-subplot\"><g class=\"shapelayer\"/><g class=\"imagelayer\"/></g><g class=\"gridlayer\"><g class=\"x\"><path class=\"xgrid crisp\" transform=\"translate(98.33,0)\" d=\"M0,10v181\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"xgrid crisp\" transform=\"translate(195,0)\" d=\"M0,10v181\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"xgrid crisp\" transform=\"translate(291.66999999999996,0)\" d=\"M0,10v181\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/></g><g class=\"y\"><path class=\"ygrid crisp\" transform=\"translate(0,156.61)\" d=\"M50,0h290\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ygrid crisp\" transform=\"translate(0,122.22)\" d=\"M50,0h290\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ygrid crisp\" transform=\"translate(0,87.83)\" d=\"M50,0h290\" style=\"stroke: rgb(232, 
232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ygrid crisp\" transform=\"translate(0,53.44)\" d=\"M50,0h290\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ygrid crisp\" transform=\"translate(0,19.05)\" d=\"M50,0h290\" style=\"stroke: rgb(232, 232, 232); stroke-opacity: 1; stroke-width: 1px;\"/></g></g><g class=\"zerolinelayer\"/><path class=\"xlines-below\"/><path class=\"ylines-below\"/><g class=\"overlines-below\"/><g class=\"xaxislayer-below\"/><g class=\"yaxislayer-below\"/><g class=\"overaxes-below\"/><g class=\"plot\" transform=\"translate(50,10)\" clip-path=\"url(#clip1a9879xyplot)\"><g class=\"barlayer mlayer\"><g class=\"trace bars\" style=\"opacity: 1;\"><g class=\"points\"><g class=\"point\"><path d=\"M203,181V77.83H280.33V181Z\" style=\"vector-effect: non-scaling-stroke; opacity: 1; stroke-width: 0.5px; fill: rgb(31, 119, 180); fill-opacity: 1; stroke: rgb(255, 255, 255); stroke-opacity: 1;\"/></g><g class=\"point\"><path d=\"M9.67,181V9.05H87V181Z\" style=\"vector-effect: non-scaling-stroke; opacity: 1; stroke-width: 0.5px; fill: rgb(31, 119, 180); fill-opacity: 1; stroke: rgb(255, 255, 255); stroke-opacity: 1;\"/></g><g class=\"point\"><path d=\"M106.33,181V9.05H183.67V181Z\" style=\"vector-effect: non-scaling-stroke; opacity: 1; stroke-width: 0.5px; fill: rgb(31, 119, 180); fill-opacity: 1; stroke: rgb(255, 255, 255); stroke-opacity: 1;\"/></g></g></g></g></g><g class=\"overplot\"/><path class=\"xlines-above crisp\" d=\"M49,191.5H340\" style=\"fill: none; stroke-width: 1px; stroke: rgb(36, 36, 36); stroke-opacity: 1;\"/><path class=\"ylines-above crisp\" d=\"M49.5,10V191\" style=\"fill: none; stroke-width: 1px; stroke: rgb(36, 36, 36); stroke-opacity: 1;\"/><g class=\"overlines-above\"/><g class=\"xaxislayer-above\"><path class=\"xtick ticks crisp\" d=\"M0,192v5\" transform=\"translate(98.33,0)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path 
class=\"xtick ticks crisp\" d=\"M0,192v5\" transform=\"translate(195,0)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"xtick ticks crisp\" d=\"M0,192v5\" transform=\"translate(291.66999999999996,0)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><g class=\"xtick\"><text text-anchor=\"middle\" x=\"0\" y=\"211.4\" transform=\"translate(98.33,0)\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\">low</text></g><g class=\"xtick\"><text text-anchor=\"middle\" x=\"0\" y=\"211.4\" transform=\"translate(195,0)\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\">medium</text></g><g class=\"xtick\"><text text-anchor=\"middle\" x=\"0\" y=\"211.4\" transform=\"translate(291.66999999999996,0)\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\">high</text></g></g><g class=\"yaxislayer-above\"><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" transform=\"translate(0,191)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" transform=\"translate(0,156.61)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" transform=\"translate(0,122.22)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" transform=\"translate(0,87.83)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" transform=\"translate(0,53.44)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><path class=\"ytick ticks crisp\" d=\"M49,0h-5\" 
transform=\"translate(0,19.05)\" style=\"stroke: rgb(68, 68, 68); stroke-opacity: 1; stroke-width: 1px;\"/><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" transform=\"translate(0,191)\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\">0</text></g><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\" transform=\"translate(0,156.61)\">1</text></g><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\" transform=\"translate(0,122.22)\">2</text></g><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\" transform=\"translate(0,87.83)\">3</text></g><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\" transform=\"translate(0,53.44)\">4</text></g><g class=\"ytick\"><text text-anchor=\"end\" x=\"41.6\" y=\"4.199999999999999\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 12px; fill: rgb(36, 36, 36); fill-opacity: 1; white-space: pre; opacity: 1;\" transform=\"translate(0,19.05)\">5</text></g></g><g class=\"overaxes-above\"/></g></g><g class=\"polarlayer\"/><g class=\"smithlayer\"/><g class=\"ternarylayer\"/><g class=\"geolayer\"/><g class=\"funnelarealayer\"/><g class=\"pielayer\"/><g class=\"iciclelayer\"/><g 
class=\"treemaplayer\"/><g class=\"sunburstlayer\"/><g class=\"glimages\"/><defs id=\"topdefs-1a9879\"><g class=\"clips\"/></defs><g class=\"layer-above\"><g class=\"imagelayer\"/><g class=\"shapelayer\"/></g><g class=\"infolayer\"><g class=\"g-gtitle\"/><g class=\"g-xtitle\"><text class=\"xtitle\" x=\"195\" y=\"239.70625\" text-anchor=\"middle\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 14px; fill: rgb(36, 36, 36); opacity: 1; font-weight: normal; white-space: pre;\">Suitability for children</text></g><g class=\"g-ytitle\" transform=\"translate(5.0654296875,0)\"><text class=\"ytitle\" transform=\"rotate(-90,9.934375000000003,100.5)\" x=\"9.934375000000003\" y=\"100.5\" text-anchor=\"middle\" style=\"font-family: 'Open Sans', verdana, arial, sans-serif; font-size: 14px; fill: rgb(36, 36, 36); opacity: 1; font-weight: normal; white-space: pre;\">count</text></g></g></svg>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "toy_dogs = dogs.query('group == \"toy\"').groupby('kids').count().reset_index()\n",
    "px.bar(toy_dogs, x='kids', y='breed', width=350, height=250,\n",
    "      category_orders={\"kids\": [\"low\", \"medium\", \"high\"]},\n",
    "      labels={\"kids\": \"Suitability for children\", \"breed\": \"count\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We do not always want to have categorical data represented by strings.\n",
    "Strings generally take up more space to store, which can greatly \n",
    "increase the size of a dataset if it contains many categorical features."
   ]
  },
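  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a rough sense of the difference in storage, here is a minimal sketch that compares a column of strings with the same information stored as small integers. The `ratings` values and the integer encoding are made up for illustration, and the exact byte counts depend on the `pandas` version:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up ratings, repeated so the size difference is easy to see.\n",
    "ratings = pd.Series(['low', 'medium', 'high'] * 1000)\n",
    "\n",
    "# Hypothetical encoding: one small integer per category.\n",
    "codes = ratings.map({'low': 0, 'medium': 1, 'high': 2}).astype('int8')\n",
    "\n",
    "# deep=True asks pandas to count the bytes used by the strings themselves.\n",
    "str_bytes = ratings.memory_usage(index=False, deep=True)\n",
    "code_bytes = codes.memory_usage(index=False, deep=True)\n",
    "print(str_bytes, code_bytes)\n",
    "```\n",
    "\n",
    "The integer version uses one byte per record, while the string version needs many times that."
   ]
  },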
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At times, a qualitative feature has many categories and we prefer a higher-level view of the data, so we collapse categories. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Collapse categories\n",
    "\n",
    "Let's create a new column, called `play`, to represent\n",
    "the groups of dogs whose \"purpose\" is to play (or not). (This is a fictitious\n",
    "distinction used for demonstration purposes.) This category consists of the toy\n",
    "and non-sporting breeds. The new feature, `play`, is a transformation of the feature\n",
    "`group` that collapses categories: toy and non-sporting are combined into one\n",
    "category, and the remaining categories are placed in a second, non-play\n",
    "category.  The boolean (`bool`) storage type is useful to indicate the\n",
    "presence or absence of this characteristic:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "with_play = dogs.assign(play=(dogs[\"group\"] == \"toy\") |\n",
    "                             (dogs[\"group\"] == \"non-sporting\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Representing a two-category qualitative feature as a boolean has a few\n",
    "advantages. For example, the mean of `play` makes sense because it returns the\n",
    "fraction of `True` values. When booleans are used for numeric calculations,\n",
    "`True` becomes 1 and `False` becomes 0:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.22093023255813954"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "with_play['play'].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This storage type gives us a shortcut to compute counts and averages\n",
    "of boolean values. In {numref}`Chapter %s <ch:linear>`, we'll see that it's also a handy \n",
    "encoding for modeling. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At other times, such as when a discrete quantitative feature has a long tail, we want to truncate the higher values, which turns the quantitative feature into an ordinal feature. We describe this next."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convert quantitative to ordinal\n",
    "\n",
    "Finally, another transformation that we\n",
    "sometimes find useful is to convert numeric values into categories. For\n",
    "example, we might collapse the values in  `ailments` into categories: 0, 1, 2,\n",
    "3, 4+. In other words, we turn `ailments` from a quantitative feature into an\n",
    "ordinal feature with the mapping 0→0, 1→1, 2→2, 3→3, and any value 4 or larger→4+.  We might want to make this transformation because few breeds have\n",
    "more than three genetic ailments. The simplified feature can be clearer\n",
    "and still adequate for an investigation."
   ]
  },
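  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to carry out this mapping is sketched below. The small `example` table and the column name `ailments_ord` are made up for illustration and are not part of the dog breed data; the sketch also assumes there are no missing values:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up ailment counts for six breeds.\n",
    "example = pd.DataFrame({'ailments': [0, 1, 2, 3, 4, 9]})\n",
    "\n",
    "# Cap the counts at 4, then relabel the capped value as '4+'.\n",
    "capped = example['ailments'].clip(upper=4)\n",
    "example['ailments_ord'] = capped.astype(str).replace({'4': '4+'})\n",
    "\n",
    "print(example['ailments_ord'].tolist())\n",
    "```\n",
    "\n",
    "The original `ailments` column is left in place, so the numeric counts remain available."
   ]
  },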
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{note}\n",
    "\n",
    "As of this writing (late 2022), `pandas` also\n",
    "implements a `category` `dtype` that is\n",
    "designed to work with qualitative data.\n",
    "However, this storage type is not yet widely\n",
    "adopted by the visualization and modeling libraries, which limits its\n",
    "usefulness. For that reason, we do not transform our qualitative variables into\n",
    "the `category` `dtype`.\n",
    "We expect that future readers may want to use the `category` `dtype` as more\n",
    "libraries support it.\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When we convert a quantitative feature to ordinal, we lose information. We can't go back. That is, if we know the number of ailments for a breed is four or more, we can't re-create the actual numeric value.\n",
    "The same thing happens when we collapse categories. For this reason, it's good practice to keep the original feature. Then, if we need to check our work or change categories, we can document and re-create our steps.\n",
    "\n",
    "In general, the feature type helps us figure out what kind of plot is most appropriate.\n",
    "We discuss the mapping between feature type and plots next."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The Importance of Feature Types"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Feature types guide us in our data analysis.  They help specify the operations,\n",
    "visualizations, and models we can meaningfully apply to the data.\n",
    "{numref}`Table %s <feature-plot>` matches the feature type(s) to the various kinds of plots that are typically good options. Whether the variable(s) are quantitative or qualitative generally\n",
    "determines the set of viable plots to make, although there are exceptions.\n",
    "Other factors that enter into the decision are the number of observations and\n",
    "whether the feature takes on only a few distinct values. For example, we might\n",
    "make a bar chart, rather than a histogram, for a discrete quantitative variable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{table} Mapping feature types to plots\n",
    ":name: feature-plot\n",
    "\n",
    "| Feature type      | Dimension | Plot     |\n",
    "| :-----        |    :-----   |          :--------- |\n",
    "| Quantitative      | One feature       | Rug plot, histogram, density curve, box plot, violin plot   |\n",
    "| Qualitative   | One feature        | Bar plot, dot plot, line plot, pie chart      |\n",
    "| Quantitative      | Two features       | Scatter plot, smooth curve, contour plot, heat map, quantile-quantile plot   |\n",
    "| Qualitative   | Two features        | Side-by-side bar plots, mosaic plot, overlaid lines      |\n",
    "| Mixed      | Two features       | Overlaid density curves, side-by-side box plots, overlaid smooth curves, quantile-quantile plot   |\n",
    "\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The feature type also helps us decide the kind of summary statistics to\n",
    "calculate. With qualitative data, we usually don't compute means or standard\n",
    "deviations, and instead compute the count, fraction, or percentage of records\n",
    "in each category. With a quantitative feature, we compute the mean or median as\n",
    "a measure of center, and, respectively, the standard deviation or interquartile\n",
    "range (75th percentile minus 25th percentile) as a measure of spread.  In\n",
    "addition to the quartiles, we may find other percentiles informative."
   ]
  },
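  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of these two kinds of summaries, the following uses a small made-up table (the `group` and `weight` values are invented for illustration, not taken from the dog breed data):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up records for illustration only.\n",
    "example = pd.DataFrame({\n",
    "    'group': ['toy', 'toy', 'hound', 'working', 'hound', 'toy'],\n",
    "    'weight': [3.0, 4.5, 25.0, 40.0, 30.0, 5.5],\n",
    "})\n",
    "\n",
    "# Qualitative feature: count and fraction of records in each category.\n",
    "counts = example['group'].value_counts()\n",
    "fractions = example['group'].value_counts(normalize=True)\n",
    "\n",
    "# Quantitative feature: median for center, interquartile range for spread.\n",
    "median = example['weight'].median()\n",
    "q1, q3 = example['weight'].quantile([0.25, 0.75])\n",
    "iqr = q3 - q1\n",
    "```\n",
    "\n",
    "Note that `quantile` interpolates between data values by default, which is one of several possible percentile definitions."
   ]
  },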
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ":::{note}\n",
    "\n",
    "The *n*th percentile is a value *q* such that *n*% of the data\n",
    "values fall at or below it. The value *q* might not be unique, and there are\n",
    "several approaches to select a unique value from the possibilities. With enough\n",
    "data, there should be little difference between these definitions.\n",
    "\n",
    "To compute percentiles in Python, we prefer using:\n",
    "\n",
    "```python\n",
    "np.percentile(data, n, method='lower')  # n is the percentile, from 0 to 100\n",
    "```\n",
    "\n",
    ":::"
   ]
  },
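  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see how the definitions can differ on a small dataset, here is a short sketch with made-up values. It assumes NumPy 1.22 or later, where `np.percentile` accepts the `method` keyword:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Four made-up values; the 50th percentile falls between 2 and 3.\n",
    "data = np.array([1, 2, 3, 4])\n",
    "\n",
    "lower = np.percentile(data, 50, method='lower')    # nearest data value below\n",
    "higher = np.percentile(data, 50, method='higher')  # nearest data value above\n",
    "linear = np.percentile(data, 50, method='linear')  # interpolates between them\n",
    "\n",
    "print(lower, higher, linear)\n",
    "```\n",
    "\n",
    "With `method='lower'`, the result is always one of the observed data values."
   ]
  },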
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When exploring data, we need to know how to interpret the shapes that our plots reveal.\n",
    "The next three sections give guidance with\n",
    "this interpretation. We also introduce many of the types of plots listed in {numref}`Table %s <feature-plot>`\n",
    "through the examples. Others are introduced in {numref}`Chapter %s <ch:viz>`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
